Editor's Preface


The international community has agreed that gender equality is one of the main topics on the
development agenda. Millennium Development Goal 3, for example, is to "promote
gender equality and empower women". Considerable effort has recently been devoted to
quantifying and understanding gender inequalities at the cross-country level, but most of the
existing research concentrates on inequalities based on outcome measures such as education,
health or participation, while the institutional basis of these inequalities is often overlooked.
For policy action, however, understanding these institutional drivers of gender inequality
seems crucial. This book makes a significant contribution in this regard.


In part 1 of this book, Boris Branisa makes a twofold contribution to the discussion of 
gender issues and development. The first is related to the measurement and understanding of 
what Amartya Sen calls substantive freedoms. Under this approach, the success of a society 
is to be evaluated primarily by the substantive freedoms that people enjoy. In essay 1 Branisa 
explores the measurement of social institutions related to gender inequality. These instituti- 
ons are understood as long-lasting norms, values and codes of conduct that shape everyday 
life and determine role models that people try to fulfill and satisfy, and as such they are es- 
sential to understand gender roles. He uses variables from the OECD Development Centre's 
Gender, Institutions and Development database and proposes several composite measures of 
social institutions related to gender inequality. Five subindices combine variables that proxy 
one dimension of social institutions: Family code, Civil liberties, Physical integrity, Son pre- 
ference and Ownership rights. The aggregation procedure is based on polychoric principal 
component analysis. The five one-dimensional measures are then combined to construct the 
Social Institutions and Gender Index (SIGI) which is a multidimensional measure of soci- 
al institutions related to gender inequality. The aggregation of the dimensions follows the 
Foster-Greer-Thorbecke approach to poverty measurement. The SIGI and the five composite
measures help to understand the deprivation of women, make it possible to rank and compare
over 100 developing countries, and identify priority areas where action is needed
in a given country. The essay also shows that these measures complement existing measures
and indicators of gender inequality.

The second main contribution of part 1 is related to another idea championed by Amartya
Sen: Going beyond the intrinsic importance of freedom as the objective of development, one 
should also consider the instrumental effectiveness of freedom of different kinds to promote 
human freedom, as greater freedom means that people can exert more influence in their li- 
ves and at the societal level. In essays 2 and 3, Branisa examines some interesting empirical 
connections at the cross-country level between social institutions related to gender inequality 
measured by the composite indices proposed in essay 1, and relevant development outcomes. 
Essay 2 reviews some of the existing theoretical literature, such as household bargaining
models, and formulates hypotheses about the potential effects of social institutions related to
gender inequality on female education, child mortality, fertility, and governance measured
as rule of law and voice and accountability. The empirical results show that among developing
countries higher inequality in social institutions is associated with worse development outcomes,
even after accounting for differences in religion, geography, political system, and the level 
of income. The focus of essay 3 is the link between social institutions related to gender ine- 
quality, and corruption. The study contributes to the existing literature on the topic showing 
that when the opportunities of women to participate in social life are restricted in developing 
countries, the perceived level of corruption tends to be higher. This empirical result is robust, 
and holds when one accounts for other possible factors that influence corruption. 


Part 2 of this book is concerned with the evolution over time of another type of inequali- 
ty which is also pertinent for most developing countries, namely inequality between regions. 
Branisa specifically deals with the question of regional convergence among departments in
Colombia in the last quarter of the 20th century, that is, whether departments that were
lagging behind the national average were able to catch up in that period. Essay 4
presents a sound review of the concepts and of the main econometric approaches to measure 
convergence empirically, and explores the Colombian case discussing crucial data issues and 
focusing on the two existing yearly time series of consistent per capita income measures: 
gross departmental product and gross household disposable income. The results suggest no 
convergence if one relies on gross departmental product, and that only a very slow conver- 
gence took place if one observes gross household disposable income. 


Essay 5 examines convergence among departments in Colombia during a similar peri- 
od, but concentrating on alternative non-income indicators. As suggested among others by 
Amartya Sen, it is important to go beyond income measures and focus on social opportuni- 
ties which contribute to the overall freedom that people have to live as they choose. Branisa 
discusses relevant public policies and major reforms put in place in Colombia during the 
period, as well as data and measurement issues concerning social indicators, and empirically
examines convergence using variables reflecting outcomes related to education, health and
nourishment. The main results show, on the one hand, that there has been convergence in
basic education. On the other hand, no robust evidence of convergence is found using health
measures, which seems consistent with the results of essay 4. 

Taken together, the essays in this book make an important contribution to the understan- 
ding of gender and regional inequalities and are of interest to scholars and policy-makers 
alike. 


Prof. Stephan Klasen, Ph.D. 
Göttingen, November 2011
Introduction and Overview


This book is a collection of five empirical essays and is divided into two independent parts. 
Part one comprises three essays that deal with social institutions related to gender inequal- 
ity at the cross-country level. Part two investigates whether there was convergence across 
Colombian departments during the last quarter of the 20th century. 


Part I: Social institutions and gender inequality 


The importance of striving for gender equality has been recognized and incorporated in the 
international development agenda, e.g. in Millennium Development Goal 3 "Promote gender
equality and empower women" or in the Convention on the Elimination of All Forms of 
Discrimination against Women (CEDAW). Nevertheless, when it comes to measurement 
of gender inequality at the cross-country level, most of the attention centers on measures 
that proxy gender inequality in well-being or in agency, and which are typically outcome- 
focused (Klasen, 2006, 2007). Focusing only on outcomes neglects the relevant question of 
the origins of these inequalities and their great heterogeneity. Gender inequality is the result 
of human behavior, and how people behave and interact is influenced by institutions. Hence, 
to understand gender inequality in outcomes, one needs to study the institutional basis of 
gender inequality. 

In Essay 1 we propose new composite measures that proxy social institutions related to 
gender inequality in non-OECD countries based on variables of the OECD Gender, Institu- 
tions and Development database (Morrison and Jütting, 2005; Jütting et al., 2008). We aggre- 
gate the variables into five subindices that each measure one dimension of social institutions 
related to gender inequality (Family code, Civil liberties, Physical integrity, Son preference 
and Ownership rights). We combine the subindices into the Social Institutions and Gender 
Index (SIGI) as a multidimensional measure of the deprivation of women caused by social 
institutions. Methodologically, the SIGI is inspired by the Foster-Greer-Thorbecke poverty 
measures. It offers a new way of aggregating gender inequality in several dimensions, penalizing
high inequality in each dimension and allowing only for partial compensation between
dimensions.


The SIGI and the subindices are useful tools to compare the societal situation of women 
in over 100 non-OECD countries from a new perspective, allowing the identification of prob- 
lematic countries and dimensions of social institutions that deserve attention by policy mak- 
ers and need to be scrutinized in detail. Empirical results show that the SIGI provides addi- 
tional information to that of other well-known gender-related indices. Moreover, regression 
analysis shows that the SIGI is related to indices that measure outcome gender inequality,
even if one takes into account region, religion and level of economic development.


Institutions are a major factor explaining development outcomes in general. Essay 2 fo- 
cuses on social institutions related to gender inequality understood as long-lasting norms, 
values and codes of conduct that shape gender roles, and presents evidence on why they 
matter for development. We derive hypotheses from existing theories and empirically test 
them at the cross-country level with linear regressions using the SIGI and its subindices as 
measures for social institutions. We find that apart from geography, political system, reli- 
gion, and the level of economic development, one has to consider social institutions related 
to gender inequality to better account for differences in development. Our results show that 
social institutions that deprive women of their autonomy and bargaining power in the house- 
hold, or that increase the private costs and reduce the private returns to investments into girls, 
are associated with lower female education, higher fertility rates and higher child mortality. 
Moreover, social institutions related to gender inequality are negatively associated with
governance measured as ‘rule of law’ and ‘voice and accountability’.


Essay 3 reexamines the link between gender inequality and corruption. We review the 
literature on the relationship between representation of women in economic and political 
life, democracy and corruption, and bring in a new previously omitted variable that captures 
the level of discrimination against women in a society: social institutions related to gender 
inequality. Using a sample of developing countries we regress corruption on the represen- 
tation of women, democracy and other control variables. Then we add the subindex Civil 
liberties proposed in Essay 1, as it covers social institutions that directly shape the oppor- 
tunities of women to participate in social life. The results show that corruption is higher in 
countries where social institutions deprive women of their freedom to participate in social 
life, even accounting for democracy and representation of women in political and economic 
life as well as for other variables. Our findings suggest that, in a context where social values 
disadvantage women, it might not be enough to push democratic reforms and to increase the
participation of women to reduce corruption.

Part II: Regional growth convergence in Colombia


Colombia is the third most populous country in Latin America. Judged by its macroeconomic
performance, it is generally considered a success story in the region. It is one
of the few countries that did not default on its external debt during the ‘lost decade’ of the last
century and that did not experience hyperinflation. The annual growth rate of per capita
GDP between 1975 and 2005 was 1.4 percent, which is twice as much as the Latin American
average in the same period. If one takes a broader perspective of development and focuses
on other indicators such as education or health, Colombia is close to or slightly above the Latin
American average.

However, Colombia is also well-known for large regional disparities in income and in 
social indicators. In Essays 4 and 5, we focus on departments, which are important political 
entities in the country, with elected local governments and separate department assemblies. 
We investigate whether there was convergence among them, i.e. if departments that were 
lagging behind have been able to catch up during the last quarter of the 20th century. 

Essay 4 focuses on growth convergence across Colombian departments during the period 
of 1975 to 2000, following both the regression and the distributional approaches suggested 
in the literature, and using two income measures computed by Centro de Estudios Ganaderos 
(CEGA). We also discuss issues related to data provided by Departamento Administrativo
Nacional de Estadística (DANE) used by previous convergence studies. Our results show
no evidence supporting convergence using per capita gross departmental product, but rather 
persistence in the distribution. Using per capita gross household disposable income, we 
find some evidence of convergence, but only at a low speed, close to one percent per year. 
Furthermore, we find no evidence of the existence of different steady states for the two 
variables considered. 

Essay 5 investigates convergence in social indicators among Colombian departments from 
1973 to 2005. We use census data and apply both the regression approach and the distributional
approach (univariate and bivariate kernel density estimators). Using the literacy rate as
a proxy for education, we find convergence between 1973 and 2005. Using the infant survival
rate and life expectancy at birth as proxies for health, we instead find persistence in the
distribution between 1975 and 2000. Additionally, using data from Demographic and Health
Surveys, we find some evidence of convergence in the rate of children that are well-nourished
between 1995 and 2005. 


Part I 


Social institutions and gender inequality 


Essay 1 


The Institutional Basis of Gender 
Inequality: The Social Institutions and 
Gender Index (SIGI) and its Subindices 


Abstract 


In this paper we construct the Social Institutions and Gender Index (SIGI) and its five 
subindices Family code, Civil liberties, Physical integrity, Son Preference and Ownership 
rights using variables of the OECD Development Centre's Gender, Institutions and Devel- 
opment database. Instead of measuring gender inequality in education, health, economic or 
political participation, these indices allow a new perspective on gender issues in develop- 
ing countries. The SIGI and the subindices measure long-lasting social institutions which 
are mirrored by societal practices and legal norms that frame gender-relevant meanings and 
form the basis of gender roles. The subindices each measure one dimension of the concept,
and the SIGI combines the subindices into a multidimensional index of deprivation of
women caused by social institutions. Methodologically, the SIGI is inspired by the Foster- 
Greer-Thorbecke poverty measures. It offers a new way of aggregating gender inequality 
in several dimensions, penalizing high inequality in each dimension and allowing only for 
partial compensation between dimensions. The SIGI and the subindices are useful tools 
to identify countries and dimensions of social institutions that deserve attention. Empirical 
results confirm that the SIGI provides additional information to that of other well-known
gender-related indices.


Based on joint work with Stephan Klasen and Maria Ziegler. 
1.1 Introduction


Despite considerable progress in recent decades, gender inequality in the manifold dimen- 
sions of well-being remains pervasive in many developing countries. This is an intrinsic 
issue of equity as the affected women are deprived of their basic freedoms (Sen, 1999). But 
going beyond this intrinsic feature of gender inequality, there is considerable evidence that 
it implies high costs for society in the form of lower human capital, worse governance, and 
lower growth (e.g. World Bank, 2001; Klasen, 2002; Klasen and Lamanna, 2009). The in- 
trinsic and instrumental value of gender equality has been recognized and incorporated in 
the development agenda, for example in Millennium Development Goal 3 "Promote gender
equality and empower women" or in the Convention on the Elimination of All Forms of 
Discrimination against Women (CEDAW). 

To measure the extent of this problem at the cross-country level several gender-related in- 
dices have been proposed, e.g. the Gender-Related Development Index (GDI) and the Gen- 
der Empowerment Measure (GEM) (United Nations Development Programme, 1995) and 
more recently the Gender Inequality Index (GII) (United Nations Development Programme, 
2010), the Global Gender Gap Index (GGG) from the World Economic Forum (Lopez-Claros 
and Zahidi, 2005), the Gender Equity Index developed by Social Watch (2005) or African 
Gender and Development Index (AGDI) proposed by the Economic Commission for Africa 
(2004). These measures focus on gender inequality in well-being or in agency and they are 
typically outcome-focused (Klasen, 2006, 2007). Focusing only on outcomes neglects the 
question of the origins of these inequalities and their great heterogeneity across space and 
time. Gender inequality is the result of human behavior, and how people behave and interact 
is influenced by institutions. Thus to understand gender inequality in outcomes, one needs 
to study the institutional basis of gender inequality. 

There are several approaches to institutions. According to North (1990, p. 3 ff.), "institutions
are the rules of the game in a society"; they are "humanly devised constraints that
shape human interaction". From an economics perspective, institutions are conceived as
the result of collective choices in a society to achieve gains from cooperation by reducing 
uncertainty, collective action dilemmas and transaction costs. A sociological or cultural per- 
spective, which is complementary to the rational choice one, relates institutions to culture. 
Institutions in this sense frame meanings and beliefs. People try to satisfy norms rather than 
to act individually within the rules of the game, i.e. institutions do not channel the preferences
of actors; they influence those preferences and shape the role models and identities of the
actors themselves. Actors and institutions amalgamate so that actors are often not aware of
the guiding principles of their behavior. Legitimacy and appropriateness drive institutional
evolution more than efficiency considerations. Cultural authority, power in a society and
community dynamics might be more relevant in shaping such institutions, which become
taken-for-granted without continuously being evaluated against efficiency considerations (Hall and
Taylor, 1996, and references therein). 

One particular type of institution is relevant for gender inequality: social institutions
related to gender inequality. These institutions are more embedded in the cultural-sociological
account, although efficiency issues may also matter. We conceive these social
institutions as long-lasting norms, values and codes of conduct that find expression in tra- 
ditions, customs and cultural practices, informal and formal laws. They are at the bottom 
of gender roles and the distribution of power between men and women in the family, in the 
market and in social and political life. As social institutions related to gender inequality build 
an often taken-for-granted basis of people's behavior and interaction in all spheres of life, 
they shape the social and economic opportunities of men and women, their autonomy in tak- 
ing decisions (Dyson and Moore, 1983; Abadian, 1996; Hindin, 2000; Bloom et al., 2001) 
or their capabilities to live the life they value (Sen, 1999). That is why they might affect 
important development outcomes and contribute to outcome gender inequalities (De Soysa 
and Jütting, 2007). 

There are three measures at the country level that to some extent proxy social institutions
determining how women are treated in society: the Women's Political Rights index
(WOPOL), the Women's Economic Rights index (WECON), and the Women's Social Rights
index (WOSOC) of the CIRI Human Rights Data Project.¹ These indices take a human rights
perspective and measure on a yearly basis whether a number of internationally recognized 
rights for women are included in law and whether government enforces them. Of the three
indices, WOSOC is the most encompassing measure covering social relations (Bjørnskov
et al., 2009). However, it does not allow one to differentiate between different dimensions of
social institutions. For example, it is important to distinguish between what happens within 
the family and what happens in public and social life. Furthermore, other shortcomings of 
all three indices are that they also cover outcomes of institutions, and they can only take 
four values from 0 (no rights) to 3 (legally guaranteed and enforced rights) which makes it 
difficult to compare and rank countries as there are many ties in the data. 

In this paper we propose new composite measures that proxy social institutions related 
to gender inequality in non-OECD countries based on variables of the OECD Development 
Centre's Gender, Institutions and Development Database (Morrison and Jütting, 2005; Jüt- 
ting et al., 2008). We aggregate the variables into five subindices that each measure one 
dimension of social institutions related to gender inequality (Family code, Civil liberties, 
Physical integrity, Son preference and Ownership rights). We combine the subindices into
the Social Institutions and Gender Index (SIGI) as a multidimensional measure of the deprivation
of women.

¹ Information is available on the webpage of the project: http://ciri.binghamton.edu/.

In general, the construction of composite measures requires several decisions, for exam- 
ple about the weighting scheme and the method of aggregation (e.g. Nardo et al., 2005). 
The subindices as one-dimensional measures are built using the method of polychoric prin- 
cipal component analysis to extract the common information of the variables corresponding 
to a subindex. When we combine the subindices to construct the SIGI, we use a reasonable 
methodology to capture the multidimensional deprivation of women caused by social institu- 
tions. The formula of the SIGI is inspired by the Foster-Greer-Thorbecke poverty measures 
(Foster et al., 1984) and offers a new way of aggregating gender inequality in several dimen- 
sions measured by the subindices. It is transparent and easy to understand, it penalizes high 
inequality in each dimension and allows only for partial compensation between dimensions. 
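
The two properties named above, penalizing high inequality within a dimension and allowing only partial compensation between dimensions, can be illustrated with a minimal sketch. The exact SIGI formula is defined later in the essay and is not reproduced in this part of the text, so the function below is an assumption: it averages the subindex values after raising them to a power `alpha`, in the spirit of the Foster-Greer-Thorbecke measures, with `alpha = 2` and equal weights chosen for illustration. The function name `fgt_style_index` and the example values are hypothetical.

```python
def fgt_style_index(subindices, alpha=2):
    """Average of subindex values raised to the power alpha.

    Assumption: equal weights and alpha = 2 are illustrative choices in the
    spirit of the Foster-Greer-Thorbecke measures, not the essay's stated formula.
    Each subindex value must lie between 0 (no inequality) and 1 (high inequality).
    """
    if not subindices:
        raise ValueError("at least one subindex is required")
    for name, value in subindices.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    return sum(value ** alpha for value in subindices.values()) / len(subindices)


# Two hypothetical countries with the same simple average (0.4) across dimensions.
balanced = {
    "Family code": 0.4,
    "Civil liberties": 0.4,
    "Physical integrity": 0.4,
    "Son preference": 0.4,
    "Ownership rights": 0.4,
}
concentrated = {
    "Family code": 1.0,
    "Civil liberties": 1.0,
    "Physical integrity": 0.0,
    "Son preference": 0.0,
    "Ownership rights": 0.0,
}

print(fgt_style_index(balanced))
print(fgt_style_index(concentrated))
```

With a simple average the two countries would be indistinguishable. Under the squared aggregation the country with very high inequality concentrated in two dimensions scores higher, because low values elsewhere only partly offset the high ones.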

The SIGI and the subindices are useful tools to compare the societal situation of women 
in over 100 non-OECD countries from a new perspective, allowing the identification of prob- 
lematic countries and dimensions of social institutions that deserve attention by policy mak- 
ers and need to be scrutinized in detail. Empirical results show that the SIGI provides addi- 
tional information to that of other well-known gender-related indices. Moreover, regression 
analysis shows that the SIGI is related to indices that measure outcome gender inequality, 
even if one takes into account region, religion and level of economic development. 

This paper is organized as follows. In section 1.2, we describe the OECD Development 
Centre's Gender, Institutions and Development Database. Then, in sections 1.3 and 1.4 
we focus on the construction of the subindices and of the SIGI. In section 1.5, we present 
empirical results by country, interesting regional patterns and a comparison between the SIGI 
and other gender-related measures. Furthermore, using regression analysis we illustrate the 
relevance of the SIGI for explaining outcome gender inequality. The last section concludes 
with a discussion of the strengths and weaknesses of the proposed measures. 


1.2 The OECD GID Database


As input for the composite measures we use variables from the OECD Development Centre's 
Gender, Institutions and Development (GID) Database (Morrison and Jütting, 2005; Jütting 
et al., 2008). This is a cross-country database covering about 120 countries with more than 20
variables measuring social institutions related to gender inequality.² These variables proxy
social institutions through prevalence rates, legal indicators or indicators of social practices.

² The data are available at the web pages http://www.wikigender.org and
http://www.oecd.org/dev/gender/gid.

We assume that the concept of social institutions related to gender inequality is multidimensional.
Following previous work done by the OECD (Jütting et al., 2008), we choose twelve
variables, each of which is assumed to measure one of four dimensions of social institutions.

The Family code dimension refers to the private sphere with institutions that influence 
the decision-making power of women in the household. Family code is measured by the 
following four variables. Parental authority measures whether women have the right to be 
the legal guardian of a child during marriage, and whether women have custody rights over 
a child after divorce. Inheritance is based on formal inheritance rights of spouses and children.
Early marriage measures the percentage of girls between 15 and 19 years of age who
are or have ever been married. Polygamy measures the acceptance of polygamy in the population.
Countries where this information is not available are assigned scores based on the legality of
polygamy.³

The public sphere is measured by the Civil liberties dimension that captures the free- 
dom of social participation of women and includes the following two variables. Freedom 
of movement measures the level of restrictions women face in moving freely outside their 
own household. Freedom of dress measures the extent to which women are obliged to follow 
a certain dress code in public, for example being obliged to cover their face or body when 
leaving the house. 

The Physical integrity dimension comprises different indicators on violence against women. 
The variable violence against women indicates the existence of laws against domestic vio- 
lence, sexual assault or rape, and sexual harassment. Female genital mutilation is the per- 
centage of women who have undergone female genital mutilation. Missing women measures 
gender bias in mortality. Countries were coded based on estimates of gender bias in mortal- 
ity for a sample of countries (Klasen and Wink, 2003) and on sex ratios of young people and 
adults. 

The Ownership rights dimension covers the economic sphere of social institutions proxied 
by the access of women to several types of property. Women’s access to land indicates 
whether women are allowed to own land. Women’s access to bank loans measures whether 
women are allowed to access credit. Women’s access to property other than land covers
mainly access to real property such as houses, but also any other property. 

Concerning the missing women variable in the Physical integrity dimension, it could be
argued that it reflects another dimension of gender inequality. Missing women is an extreme
manifestation of son preference under scarce resources. 100 million women are not alive
who should be alive if women were not discriminated against (Sen, 1992; Klasen and Wink,
2003). The other components of Physical integrity, violence against women and female
genital mutilation, measure in particular the treatment of women, which is not only motivated
by economic considerations. In the next section, we check with statistical methods whether missing
women measures a different dimension than the variables violence against women and female
genital mutilation.

³ Acceptance of polygamy in the population might proxy actual practices better than the formal indicator legality
of polygamy and, moreover, laws might be changed faster than practices. Therefore, the acceptance variable
is the first choice for the subindex Family code. The reason for using legality when acceptance is missing is
to increase the number of countries.

These twelve variables take values between 0 and 1. The value 0 means no or very low inequality
and the value 1 indicates high inequality. Three of the variables (early marriage, female
genital mutilation and violence against women) are continuous. The other indicators measure
social institutions on an ordinal categorical scale. The chosen variables cover around 120
non-OECD countries from all regions in the world except North America.⁴ The choice of
the variables is also guided by the availability of information so that as many countries as
possible can be ranked by the SIGI. Within our sample 102 countries have information for
all twelve variables.


1.3 Construction of the Subindices 


The objective of the subindices is to provide a summary measure for each dimension of so- 
cial institutions related to gender inequality. In every subindex we want to combine variables 
that are assumed to belong to one dimension. The first step is to check the statistical asso- 
ciation between the variables. The second step consists in aggregating the variables with a 
reasonable weighting scheme. 


1.3.1 Measuring the Association between Categorical Variables 


To check the association between variables, most of which are ordinal, we use Kendall
tau b and Multiple Joint Correspondence Analysis (Greenacre, 2007; Nenadic, 2007).
Kendall tau b is a rank correlation coefficient. Rank correlation measures are useful when the data
are ordinal and thus the conditions for using Pearson's correlation coefficient are not fulfilled.
For each variable, the values are ordered and ranked. Then the correspondence between the
rankings is measured.⁵


⁴ The OECD Gender, Institutions and Development Database does not contain variables that capture relevant
social institutions related to gender inequality in OECD countries.
⁵ For calculating Kendall tau, one counts the number of concordant and discordant pairs of two rankings, takes
the difference and divides this difference by the total number of pairs. A value of 1 means total correspondence
of rankings, i.e. the rankings are the same. A value of -1 indicates reverse rankings or a negative association
between rankings. A value of 0 means independence of rankings. Kendall tau b is a variant of Kendall tau

Taking into account tied pairs, the formula for Kendall tau b is

\[
\tau_b = \frac{C - D}{\sqrt{\left[\frac{n(n-1)}{2} - T_x\right]\left[\frac{n(n-1)}{2} - T_y\right]}} \tag{1.1}
\]

where $C$ is the number of concordant pairs, $D$ is the number of discordant pairs, $n$ is the
number of observations, $n(n-1)/2$ is the number of all pairs, $T_x$ is the number of pairs tied on
the variable $x$ and $T_y$ is the number of pairs tied on the variable $y$. The notation is taken from
Agresti (1984). The p-value of tau b under the null hypothesis of no association between 
the variables is computed with the approximation suggested by Kendall (1976), which is 
adequate unless ties are very extensive. As in our case many ties are present, we confirm the 
results with an asymptotically distribution-free confidence interval for Kendall's tau b based 
on the bootstrap method with 1000 replications (Hollander and Wolfe, 1999). 
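The computation of Kendall tau b and of a bootstrap interval can be sketched as follows. This is a minimal illustration and not the code used for the reported results: the function names are hypothetical, the pairs are counted by brute force, and the interval is a simple percentile bootstrap, which is our assumption about one way to implement the resampling step.

```python
import math
import random
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall tau b: (C - D) / sqrt((N - T_x) * (N - T_y)), N = n(n-1)/2."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x = tied_y = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            tied_x += 1          # pair tied on x (counted even if also tied on y)
        if dy == 0:
            tied_y += 1          # pair tied on y
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) / 2
    denominator = math.sqrt((n_pairs - tied_x) * (n_pairs - tied_y))
    if denominator == 0:
        return float("nan")      # undefined when one variable is constant
    return (concordant - discordant) / denominator


def bootstrap_interval(x, y, replications=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for tau b (observations resampled jointly)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(replications):
        idx = [rng.randrange(n) for _ in range(n)]
        tau = kendall_tau_b([x[i] for i in idx], [y[i] for i in idx])
        if not math.isnan(tau):  # skip resamples in which tau b is undefined
            estimates.append(tau)
    if not estimates:
        raise ValueError("tau b was undefined in every bootstrap sample")
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * (len(estimates) - 1))]
    upper = estimates[int((1 + level) / 2 * (len(estimates) - 1))]
    return lower, upper
```

Applied to two ordinal variables coded as numbers, `kendall_tau_b` returns the coefficient and `bootstrap_interval` an approximate interval. An interval that excludes 0 points in the same direction as a small p-value.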

As a second method to check the association between variables we examine the graphics
produced by Multiple Joint Correspondence Analysis (MJCA) (Greenacre, 2007; Nenadic,
2007), after having discretized the three continuous variables. Correspondence Analysis is
a method for analyzing and representing the structure of contingency tables graphically. We
use MJCA to find out whether variables seem to measure the same dimension.⁶

The results for Kendall tau b are reported in Tables 1.1-1.5. A significant positive value of
Kendall tau b is a sign of a positive association between two variables. This is the case for
all variables belonging to one dimension, except missing women in the subindex Physical
integrity. The graphs produced with MJCA are shown in Figures 1.1-1.5.⁷ The results of
MJCA also confirm that within every dimension all the variables seem to measure the same
dimension, with the exception of missing women in the dimension Physical integrity. These
results support the argument in section 1.2.


that corrects for ties, which are frequent in the case of discrete data (Agresti, 1984, chap. 9). We consider
Kendall Tau b to be the appropriate measure of rank correlation to find out whether our data are related.
6 Correspondence Analysis is an exploratory and descriptive method to analyze contingency tables. Instead of
calculating a correlation coefficient to capture the association of variables, the correspondence of conditional
and marginal distributions of either rows or columns, also called row or column profiles, is measured using
a χ²-statistic that captures the distance between them. These row or column profiles are then plotted in a
low-dimensional space, so that the distances between the points reflect the dissimilarities between the profiles.
Multiple Joint Correspondence Analysis is an extended procedure for the analysis of more than two variables
and considers the cross-tabulations of the variables against each other in a so-called Burt matrix but with
modified diagonal sub-tables. This makes it easier to see whether variables are associated. This is the case
when they have similar deviations from homogeneity, and therefore get a similar position in a profile space
(Greenacre, 2007; Nenadic, 2007).
7 The graphs produced with MJCA can be interpreted in the following way. In most cases, one of the axes
represents whether there is inequality and the other axis represents the extent of inequality. If one connects
the values of a variable one obtains a graphical pattern. If this is similar to the pattern obtained for another
variable, then both variables are associated.
We decide to use the variable missing women as a fifth subindex called Son preference.
The artificially higher female mortality is one of the most important and cruel aspects of
gender inequality and should not be neglected, as over 100 million women who should be
alive are missing (Sen, 1992; Klasen and Wink, 2003). Missing women is the "starkest
manifestation of the lack of gender equality" (Duflo, 2005).


1.3.2 Aggregating Variables to Build a Subindex 


The five subindices Family code, Civil liberties, Son preference, Physical integrity and
Ownership rights use as input the twelve variables mentioned in the previous section.
Each subindex combines variables that measure one dimension of social institutions related
to gender inequality. In the case of Son preference, the subindex takes the value of the
variable missing women. In all other cases, the computation of the subindex values involves two
steps.

In the first step, the method of polychoric principal component analysis is used to
extract the common information of the variables corresponding to a subindex. Principal
component analysis (PCA) is a method of dimensionality reduction that is valid for normally
distributed variables (Jolliffe, 1986). This assumption is violated in this case, as the data
include variables that are ordinal, and hence the Pearson correlation coefficient is not
appropriate. Following Kolenikov and Angeles (2004, 2009) we use polychoric PCA, which
relies on polychoric and polyserial correlations. These are estimated with maximum
likelihood, assuming that there are latent normally distributed variables that underlie the ordinal
categorical data. We use the First Principal Component (FPC) as a proxy for the common
information contained in the variables corresponding to the subindices, measuring each one
of the dimensions of social institutions related to gender inequality. The FPC is the weighted
sum of the standardized original variables that captures as much of the variance in the data
as possible.⁸ The standardization of the original variables is done as follows. In the case
of continuous variables, one subtracts the mean and then divides by the standard deviation.
In the case of ordinal categorical variables, the standardization uses results of an ordered
probit model. The weight that each variable gets in these linear combinations is obtained by
analyzing the correlation structure in the data. The weights are shown in Table 1.6.
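How the weights and the explained variance of a first principal component follow from a correlation matrix can be illustrated with a short sketch. It assumes that a correlation matrix is already available and that all correlations are positive. Estimating polychoric correlations is not shown, and the function names and the power-iteration approach are our illustrative choices, not the procedure of Kolenikov and Angeles.

```python
import math


def first_principal_component(corr, iterations=500, tol=1e-12):
    """Return (weights, explained_share) of the first principal component
    of a correlation matrix, computed by power iteration."""
    k = len(corr)
    vec = [1.0 / math.sqrt(k)] * k       # start from equal weights
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(p * p for p in product))
        new_vec = [p / norm for p in product]
        # Rayleigh quotient of the unit-length vector approximates the eigenvalue
        new_eigenvalue = sum(
            new_vec[i] * sum(corr[i][j] * new_vec[j] for j in range(k))
            for i in range(k)
        )
        converged = abs(new_eigenvalue - eigenvalue) < tol
        vec, eigenvalue = new_vec, new_eigenvalue
        if converged:
            break
    # The trace of a correlation matrix equals the number of variables,
    # so eigenvalue / k is the share of total variance explained.
    return vec, eigenvalue / k


def fpc_score(weights, standardized_values):
    """Weighted sum of the standardized variables for one country."""
    return sum(w * z for w, z in zip(weights, standardized_values))
```

For two variables with a hypothetical correlation of 0.6, both weights equal about 0.707 and the first component explains 80% of the variance.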

In the second step, the subindex value is obtained by rescaling the FPC so that it ranges from
0 to 1 to ease interpretation. A country with the best possible performance (no inequality) is
assigned the value 0 and a country with the worst possible performance (highest inequality)
the value 1. Hence, the subindex values of all countries are between 0 and 1. Using the


8 The proportion of explained variance by the first principal component is 70% for Family code, 93% for Civil
liberties, 60% for Physical integrity and 87% for Ownership rights.
score of the FPC, the subindex is calculated using the following transformation. Country
X corresponds to a country of interest, Country Worst corresponds to a country with the worst
possible performance and Country Best is a country with the best possible performance.

$$\text{Subindex(Country X)} = \frac{\text{FPC(Country X)}}{\text{FPC(Country Worst)} - \text{FPC(Country Best)}} - \frac{\text{FPC(Country Best)}}{\text{FPC(Country Worst)} - \text{FPC(Country Best)}} \tag{1.2}$$
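The rescaling in equation (1.2) can be written as a small function. The function name and the FPC scores used to try it out are hypothetical.

```python
def subindex(fpc_country, fpc_best, fpc_worst):
    """Rescale an FPC score as in equation (1.2): best -> 0, worst -> 1."""
    spread = fpc_worst - fpc_best
    if spread == 0:
        raise ValueError("best and worst FPC scores must differ")
    return fpc_country / spread - fpc_best / spread
```

With hypothetical scores of -1.5 for the best and 2.5 for the worst possible performance, a country with an FPC score of 0.5 obtains a subindex value of 0.5.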


Every subindex is intended to measure a different dimension of social institutions related
to gender inequality. To check whether the subindices are empirically non-redundant, so that
each provides additional information, we analyze the statistical association between them.
In the case of well-being measures, McGillivray and White (1993) suggest using two explicit
thresholds to separate redundancy from non-redundancy, namely correlation coefficients of
0.90 and 0.70. Based on this suggestion we use the intermediate threshold 0.80.
In Table 1.7 we present Kendall tau b as a measure of the statistical association between the
five subindices. In all cases, the subindices are positively correlated, showing that they all
measure social institutions related to gender inequality. It must be noted, however, that the
correlation is not always statistically significant. Kendall tau b is lower than 0.80 in all cases,
which suggests that each subindex measures a distinct aspect of social institutions related to
gender inequality.


1.4 The Social Institutions and Gender Index (SIGI) 


With the subindices described in the last section as input, we build a multidimensional com- 
posite index named Social Institutions and Gender Index (SIGI) which reflects the depriva- 
tion of women caused by social institutions related to gender inequality. The proposed index 
is transparent and easy to understand. As in the case of the variables and of the subindices, 
the index value 0 corresponds to no inequality and the value 1 to complete inequality. 

The SIGI is an unweighted average of a non-linear function of the subindices. We use 
equal weights for the subindices, as we see no reason for valuing one of the dimensions more 
or less than the others.⁹ The non-linear function arises because we assume that inequality
in gender-related social institutions leads to deprivation experienced by the affected women, 
9 Empirically, even in the case of equal weights the ranking produced by a composite index is influenced by the
different variances of its components. The component that has the highest variance has the largest influence
on the composite index. In the case of the SIGI the variances of the five components are reasonably close to
each other, Ownership rights having the largest and Physical integrity having the lowest variance.
and that deprivation increases more than proportionally when inequality increases. Thus,
high inequality is penalized in every dimension. The non-linearity also means that the SIGI
does not allow for total compensation among subindices, but permits partial compensation.
Partial compensation implies that high inequality in one dimension, i.e. subindex, can only
be partially compensated with low inequality in another dimension.¹⁰


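For our specific five subindices, the value of the SIGI is then calculated as follows.

$$\begin{aligned}
\text{SIGI} = {} & \tfrac{1}{5}(\text{Subindex Family Code})^2 + \tfrac{1}{5}(\text{Subindex Civil Liberties})^2 \\
& + \tfrac{1}{5}(\text{Subindex Physical Integrity})^2 + \tfrac{1}{5}(\text{Subindex Son Preference})^2 \\
& + \tfrac{1}{5}(\text{Subindex Ownership Rights})^2
\end{aligned} \tag{1.3}$$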


Using a more general notation, the formula for the SIGI $I(X)$, where $X$ is the vector
containing the values of the subindices $x_i$ with $i = 1,\dots,n$, is derived from the following
considerations. For any subindex $x_i$, we interpret the value 0 as the goal of no inequality to
be achieved in every dimension. We define a deprivation function $\phi(x_i, 0)$, with $\phi(x_i, 0) > 0$
if $x_i > 0$ and $\phi(x_i, 0) = 0$ if $x_i = 0$ (e.g. Subramanian, 2007). Higher values of $x_i$ should lead
to a penalization in $I(X)$ that should increase with the distance of $x_i$ to zero. In our case the
deprivation function is the square of the distance to 0 so that deprivation increases more than
proportionally as inequality increases.

$$\text{SIGI} = I(X) = \frac{1}{n}\sum_{i=1}^{n}\phi(x_i, 0) = \frac{1}{n}\sum_{i=1}^{n}(x_i - 0)^2 = \frac{1}{n}\sum_{i=1}^{n}x_i^2 \tag{1.4}$$
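The aggregation formula can be expressed in a few lines of code. The function name and the subindex values used to try it out are hypothetical; the computation itself is the unweighted average of squared subindex values described in the text.

```python
def sigi(subindices):
    """Unweighted average of the squared subindex values."""
    values = list(subindices)
    if not values:
        raise ValueError("at least one subindex value is required")
    if any(v < 0 or v > 1 for v in values):
        raise ValueError("subindex values must lie between 0 and 1")
    return sum(v * v for v in values) / len(values)
```

A hypothetical country with one subindex equal to 1 and four equal to 0 obtains an index value of 0.2, whereas a country with all five subindices equal to 0.2 has the same average subindex value but an index value of only 0.04.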


The formula is inspired by the Foster-Greer-Thorbecke (FGT) poverty measures (Foster
et al., 1984). The general FGT formula is defined for $y_i \le z$ as:

$$FGT(Y, \alpha, z) = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{z - y_i}{z}\right)^{\alpha} \tag{1.5}$$

where $Y$ is the vector containing all incomes, $y_i$ with $i = 1,\dots,n$ is the income of individual
$i$, $z$ is the poverty line, and $\alpha > 0$ is a penalization parameter.


10 Other approaches have also been proposed in the literature, e.g. the non-compensatory approach by Munda
and Nardo (2005a,b).
To compute the SIGI, the value 2 is chosen for α, as the square function has the advantage
of easy interpretation. With α = 2 the transfer principle is satisfied (Foster et al., 1984). In
the context of poverty this principle means that a transfer from a person below the poverty
line to a person less poor will raise poverty if the set of poor remains unchanged. In the
case of the SIGI, the transfer principle means that an increase in inequality in one dimension
and a decrease of inequality in another dimension of the same magnitude will raise the SIGI,
assuming that both dimensions had equal values at the beginning.
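A small numerical illustration with hypothetical values may help. Consider only two dimensions, both starting at 0.5, and shift 0.1 from one dimension to the other so that the average subindex value stays at 0.5:

$$\begin{aligned}
\text{before:} \quad & \tfrac{1}{2}\left(0.5^2 + 0.5^2\right) = 0.25 \\
\text{after:} \quad & \tfrac{1}{2}\left(0.6^2 + 0.4^2\right) = \tfrac{1}{2}\left(0.36 + 0.16\right) = 0.26
\end{aligned}$$

The index value rises from 0.25 to 0.26 although the average of the two dimensions is unchanged.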

Some differences between the SIGI and the FGT measures must be highlighted. In the
case of the SIGI, we are aggregating across dimensions and not over individuals. Moreover,
in contrast to the income case, a lower value of $x_i$ is preferred, and the normalization achieved
when dividing by the poverty line $z$ is not necessary as $0 \le x_i \le 1$, $i = 1,\dots,n$.

The SIGI fulfills several properties. For a formal presentation of the properties and the 
proofs, see the Appendix to Essay 1. 


• Support and range: The value of the index can be computed for any values of the
subindices, and it is always between 0 and 1.

• Anonymity: Neither the name of the country nor the name of the subindex has an
impact on the value of the index.

• Unanimity or Pareto Optimality: If a country has values for every subindex that are
lower than or equal to those of another country, then the index value for the first country
is lower than or equal to the one for the second country.

• Monotonicity: If one country has a lower value for the index than a second country,
and a third country has the same values for the subindices as the first country, except
for one subindex which is lower, then the third country has a lower index value than
the second country.

• Penalization of dispersion: For two countries with the same average value of the
subindices, the country with the lowest dispersion of the subindices gets a lower value
for the index.

• Compensation: Although the SIGI is not conceived for changes over time, this
property is more intuitively understood in the following way. Assume there are only two
subindices and that a country has the same level of inequality in the two subindices. If
the country experiences an increase in inequality by a given amount on one subindex,
then it can only have the same value of the index as before, if there is a decrease in
inequality on the other subindex that is higher in absolute value than the increase in
the first subindex.
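The compensation property can be made concrete with hypothetical numbers. Suppose both subindices equal 0.5 and the first one increases by 0.1. The decrease $e$ in the second subindex that keeps the index unchanged solves

$$\tfrac{1}{2}\left(0.6^2 + (0.5 - e)^2\right) = \tfrac{1}{2}\left(0.5^2 + 0.5^2\right) \;\Longrightarrow\; (0.5 - e)^2 = 0.14 \;\Longrightarrow\; e = 0.5 - \sqrt{0.14} \approx 0.126 .$$

The required decrease of about 0.126 is larger than the increase of 0.1, which is what partial compensation means here.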
To highlight the effects of partial compensation as compared to total compensation we
computed the statistical association between the SIGI and a simple arithmetic average of the
five subindices that allows for total compensation and compared the country rankings of both
measures.¹¹ The Pearson correlation coefficient between the SIGI and the simple arithmetic
average of the five subindices shows a high and statistically significant correlation between
both measures (Table 1.8). However, when we compare the ranks of the SIGI with those
obtained using a simple arithmetic average of the five subindices in Table 1.9, we observe
that there are noticeable differences in the rankings of the 102 included countries. Examples
are China and Nepal. China ranks in position 55 using the simple average, but worsens
to place 83 in the SIGI ranking. Nepal has place 84 considering the simple average, and
improves to rank 65 using the SIGI. For China, this is due to the high value on the subindex
Son preference, which in the SIGI case cannot be fully compensated with relatively low
values for the other subindices. For Nepal we observe the opposite case as all subindices
have values reflecting moderate inequality.
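The mechanism behind such rank changes can be reproduced with invented data. The three country profiles below are hypothetical and are not taken from Table 1.9; the sketch only shows how a single very high subindex lowers a country's position once squared values are averaged.

```python
def simple_average(values):
    """Arithmetic mean of the subindices (total compensation)."""
    return sum(values) / len(values)


def sigi(values):
    """Mean of squared subindices (partial compensation)."""
    return sum(v * v for v in values) / len(values)


def ranks(scores):
    """Rank 1 = lowest inequality; tied scores share the better rank."""
    ordered = sorted(scores.values())
    return {name: ordered.index(score) + 1 for name, score in scores.items()}


# Hypothetical subindex profiles (five dimensions each).
profiles = {
    "Country A": [0.1, 0.1, 0.1, 1.0, 0.1],       # one extreme dimension
    "Country B": [0.35, 0.35, 0.35, 0.35, 0.35],  # moderate everywhere
    "Country C": [0.3, 0.3, 0.3, 0.3, 0.3],       # moderate everywhere
}

rank_by_mean = ranks({c: simple_average(v) for c, v in profiles.items()})
rank_by_sigi = ranks({c: sigi(v) for c, v in profiles.items()})
```

In this invented example Country A ranks first by the simple average but last by the squared-value index, because its single extreme value cannot be fully offset by low values elsewhere.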


1.5 Results 


1.5.1 Country Rankings and Regional Patterns 


The subindices are computed for countries that have no missing values on the relevant input 
variables. In the case of the SIGI only countries that have values for every subindex are 
considered. The results for the SIGI and its five subindices are presented in Table 1.10. 
Among the 102 countries considered by the SIGI, Paraguay, Croatia, Kazakhstan, Argentina 
and Costa Rica have the lowest levels of gender inequality related to social institutions. 
Sudan is the country that occupies the last position, followed by Afghanistan, Sierra Leone, 
Mali and Yemen, which means that gender inequality in social institutions is a major problem 
there. 

Rankings according to the subindices are as follows. For Family code 112 countries can
be ranked. Best performers are China, Jamaica, Croatia, Belarus and Kazakhstan. Worst
performers are Mali, Chad, Afghanistan, Mozambique and Zambia. In the dimension Civil
liberties 123 countries are ranked. Among them 83 share place 1 in the ranking. Sudan, Saudi
Arabia, Afghanistan, Yemen and Iran occupy the last five positions of high inequality. 114
countries can be compared with the subindex Physical integrity. Hong Kong, Bangladesh,
11 We cannot compare the SIGI with the results of the non-compensatory index as proposed by Munda and
Nardo (2005a,b). The algorithm used for calculating non-compensatory indices compares each pair of
countries for each subindex. However, as our dataset includes many countries with equal values on several
subindices, the numerical algorithm cannot provide a ranking.
Chinese Taipei, Ecuador, El Salvador, Paraguay and Philippines are at the top of the ranking
while Mali, Somalia, Sudan, Egypt and Sierra Leone are at the bottom. In the dimension Son
preference 88 out of 123 countries rank at the top as they do not have problems with missing
women. The countries that rank worst are China, Afghanistan, Papua New Guinea, Pakistan,
India and Bhutan. Finally, 122 countries are ranked with the subindex Ownership rights. 42
countries share position 1 as they have no inequality in this dimension. On the other hand, the
four worst-performing countries are Sudan, Sierra Leone, Chad and the Democratic Republic
of Congo.


To find out whether apparent regional patterns in social institutions related to gender
inequality are systematic, we divide the countries into quintiles following the scores of the SIGI
and its subindices (Table 1.11). The first quintile includes countries with lowest inequality,
and the fifth quintile countries with highest inequality.
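One possible way to assign countries to quintile groups is sketched below. The exact rule used for Table 1.11 is not spelled out in the text, so the cut-point rule (nearest-rank percentiles, tied scores kept in the same group), the function name and the scores are assumptions made for illustration.

```python
import math


def quintile_groups(scores):
    """Map each country to a group from 1 (lowest inequality) to 5 (highest).

    Cut points are the scores at the 20th, 40th, 60th and 80th percentile
    positions (nearest rank). Countries with equal scores always end up in
    the same group, so fewer than five groups may be occupied."""
    ordered = sorted(scores.values())
    n = len(ordered)
    cuts = [ordered[math.ceil(n * k / 5) - 1] for k in range(1, 5)]
    return {
        country: 1 + sum(score > cut for cut in cuts)
        for country, score in scores.items()
    }
```

With many tied scores at 0, the first three cut points coincide and only three distinct groups remain occupied, which mirrors the situation described for subindices in which most countries share the best score.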


For the SIGI, no country of Europe and Central Asia (ECA) or Latin America and the 
Caribbean (LAC) is found in the two quintiles reflecting social institutions related to high 
gender inequality. In contrast, most countries in South Asia (SA), Sub-Saharan Africa (SSA), 
and Middle East and North Africa (MENA) rank in these two quintiles. It is interesting to 
note that in the most problematic regions two countries rank in the first two quintiles. These 
are Mauritius (SSA) and Tunisia (MENA). East Asia and Pacific (EAP) has countries in all 
five quintiles with Philippines, Thailand, Hong Kong and Singapore in the first quintile and 
China in the fifth quintile. 


Turning to the subindices, the patterns are similar to that of the SIGI. As more
information is available for the subindices, the number of countries covered by every subindex
is different and higher than for the SIGI. In the following some interesting facts are
highlighted, especially countries whose scores differ from the average in the region.


• Family code: No country in ECA, LAC or EAP shows high inequality. SA, MENA and
SSA remain problematic with countries with social institutions related to high gender
inequality. Exceptions are Bhutan in SA, Mauritius in SSA, and Tunisia and Israel in
MENA.

• Civil liberties: Only three groups of countries using the quintile analysis can be
generated with the first group including the first three quintiles. In SSA over one-half of
the countries are now in the first group. Also in MENA there are some countries with
good scores (Israel, Morocco and Tunisia). No country in SA is found in the first three
quintiles of low and moderate inequality.

• Physical integrity: Most problematic regions are SSA and MENA. Exceptions in these
regions are Botswana, Mauritius, South Africa and Tanzania (SSA), and Morocco and
Tunisia (MENA).

• Son preference: Again only three groups of countries can be built by quintile analysis,
with the first group including the first three quintiles. As in the case of Civil liberties
most of the countries in SSA do not show problems. Missing women is mainly an issue
in SA and MENA. But in both regions there are countries that rank in the first group.
These are Sri Lanka in SA, and Israel, Lebanon and Occupied Palestinian Territory in
MENA.

• Ownership rights: Most problematic regions are SA, SSA and MENA. Nevertheless,
there are cases in these regions that rank in the first quintile. These are Egypt, Israel,
Kuwait and Tunisia (MENA), Bhutan (SA), and Eritrea and Mauritius (SSA).


1.5.2 Simple Correlation with other Gender-related Indices 


The SIGI is an important measure to understand gender inequality as it measures
institutions that influence the basic functioning of society and explain gender inequality in
outcomes. From this perspective, the SIGI adds value to other gender-related measures
irrespective of whether it is empirically redundant, i.e. whether it provides additional
information as compared to other measures.

Nevertheless, one can check whether the index is empirically redundant with an analysis 
of the statistical association between the SIGI and other well-known gender-related indices. 
Relying on McGillivray and White (1993) we use a correlation coefficient of 0.80 in absolute 
value as the threshold to separate redundancy from non-redundancy. 

We calculated the Pearson correlation coefficient and Kendall tau b as a measure of rank
correlation between the SIGI and each of the following indices: the Gender-related
Development Index (GDI) and the Gender Empowerment Measure (GEM) from United Nations
Development Programme (2006), the Global Gender Gap Index (GGG) from Hausmann
et al. (2007) and the Women's Social Rights Index (WOSOC).¹² As the GDI and the GEM
have been criticized in the literature (e.g. Klasen, 2006; Schüler, 2006), we also do the
analysis for two alternative measures, the Gender Gap Index Capped (GGI) and a revised Gender
Empowerment Measure (GEM2) based on income shares proposed by Klasen and Schüler
(2009).¹³ For all the indices considered both measures of statistical association are lower


12 Data obtained from http://ciri.binghamton.edu/.
13 The Gender Gap Index Capped (GGI) is a geometric mean of the ratios of female to male achievements in the
dimensions health, education and labor force participation. Capped means that every component is capped
than 0.80 in absolute value and statistically significant. We conclude that the SIGI is related
to these gender measures but is non-redundant. These results as well as the comparison of
the country rankings of the SIGI and these other measures can be found in Tables 1.12 and
1.13.


1.5.3 Regression Analysis 


The SIGI aims to measure the institutional basis of gender inequality. To explore whether
the SIGI is associated with gender inequality in outcomes we use linear regressions with two
well-known measures as dependent variables and the SIGI as regressor. The first is the
Global Gender Gap Index (GGG) that captures gaps in outcome variables related to basic
rights such as health, economic participation and political empowerment. The second
measure is the ratio of the Gender-Related Development Index (GDI) to the Human Development
Index (HDI) as a composite measure of gender inequality in the dimensions health, education
and income.¹⁴ In both regressions we control for the level of economic development using
the log of per capita GDP in constant prices (US$, PPP, base year: 2005) (World Bank,
2008); for religion using a Muslim majority and a Christian majority dummy, the left-out
category being countries that have neither a majority of Muslim nor a majority of Christian
population (Central Intelligence Agency, 2009); and for geography and other unexplained
heterogeneity that might go together with region using region dummies, the left-out category
being Sub-Saharan Africa. As the number of observations is lower than 100, we use HC3
robust standard errors proposed by Davidson and MacKinnon (1993) to account for possible
heteroscedasticity in our data.
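The idea behind HC3 standard errors can be shown for the simplest case of one regressor and an intercept. This is a deliberately reduced sketch: the regressions reported here include several control variables, the function name is hypothetical, and the data used to try it out are invented. In HC3 each squared residual is divided by the square of one minus the leverage of the observation.

```python
import math


def ols_hc3(x, y):
    """Fit y = a + b*x by ordinary least squares.

    Returns (intercept, slope, hc3_standard_error_of_slope)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    variance = 0.0
    for xi, yi in zip(x, y):
        residual = yi - intercept - slope * xi
        leverage = 1.0 / n + (xi - x_bar) ** 2 / sxx
        if leverage >= 1.0 - 1e-12:
            raise ValueError("an observation has leverage 1; HC3 is undefined")
        weight = (xi - x_bar) / sxx   # slope = sum(weight_i * y_i)
        variance += weight ** 2 * residual ** 2 / (1.0 - leverage) ** 2
    return intercept, slope, math.sqrt(variance)
```

Observations with high leverage have their squared residuals inflated the most, which makes the resulting standard errors more conservative in small samples.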

The regression results are shown in Table 1.14. When GGG is used as the dependent
variable, 73 countries are included in the sample and the adjusted coefficient of determination
R² is 0.62. The SIGI is negatively associated with GGG and significant at the 1% level. In
the second regression the ratio of GDI to HDI is the dependent variable. The sample consists
of 79 countries and the adjusted R² is 0.44. The SIGI is again negatively associated with the


at one before calculating the geometric mean. This is necessary as a better relative performance of women,
e.g. in the dimension health, can be due to a risky behavior of men that should not be rewarded. GGI can
be more directly interpreted as a measure of gender inequality while the GDI measures human development
penalizing gender inequality. The GEM has three components: political representation, representation in
senior positions in the economy, and power over economic resources. The most problematic component
is power over economic resources proxied by earned incomes. This component measures female and male
earned incomes using income levels adjusted by gender gaps but not the gender gaps themselves. The revised
version GEM2 uses income shares of males and females.
14 As the GDI is not a measure of gender inequality, UNDP recommends using the ratio of GDI to HDI as a
proxy of gender inequality (http://hdr.undp.org/en/statistics/indices/gdi_gem/).
response variable and this association is statistically significant at the 1% level. The results
suggest that gender inequality in well-being and empowerment is strongly associated with
social institutions that shape gender roles.

Even if we include control variables in the regressions we cannot rule out omitted variable
bias, but as social institutions related to gender inequality are relatively stable and
long-lasting, we consider that endogeneity does not pose a major problem. To
check that our findings are not driven by observations that have large residuals and/or high
leverage, we also run robust regressions, obtaining similar results.¹⁵


1.6 Conclusion 


In this paper we present composite indices that offer a new way to approach gender inequal- 
ity that has been neglected in the literature and by other gender measures that focus mainly 
on well-being and agency. Instead of measuring gender inequality in education, health, eco- 
nomic or political participation and other dimensions, the proposed measures proxy the un- 
derlying social institutions that are mirrored by societal practices and legal norms that might 
produce inequalities between women and men in developing countries. 

Based on 12 variables of the OECD Gender, Institutions and Development (GID) Da- 
tabase (Morrison and Jütting, 2005; Jütting et al., 2008) we construct five subindices each 
capturing one dimension of social institutions related to gender inequality: Family code, 
Civil liberties, Physical integrity, Son preference and Ownership rights. The Social Insti- 
tutions and Gender Index (SIGI) combines the subindices into a multidimensional index of 
deprivation of women caused by social institutions related to gender inequality. With these 
measures over 100 developing countries can be compared and ranked. 

When constructing composite indices one is always confronted with decisions and trade- 
offs concerning for example the choice and treatment of the variables included, the weight- 
ing scheme and the aggregation method. We try to be transparent in our choices. As the 
subindices are intended each to proxy one dimension of social institutions, we use the method 
of polychoric PCA to extract the common element of the included variables (Kolenikov and 
Angeles, 2009). The methodology for constructing the multidimensional SIGI is based on 
the assumption that in each dimension deprivation of women increases more than propor- 
tionally when inequality increases, and that each dimension should be weighted equally. 
The formula of the SIGI is inspired by the FGT poverty measures (Foster et al., 1984) and 
15 Results are available upon request. The type of robust regression we perform uses iteratively reweighted
least squares and is described in Hamilton (1992). A regression is run with ordinary least squares, then case
weights based on absolute residuals are calculated, and a new regression is performed using these weights.
The iterations continue as long as the maximum change in weights remains above a specified value.
has the advantage of penalizing high inequality in each dimension and only allowing for
partial compensation among the five dimensions. We consider that the formula to compute the
SIGI is easy to understand and to communicate.


However, some limitations of the subindices and the SIGI must be noted. First, a com- 
posite index depends on the quality of the data used as input. Social institutions related to 
gender inequality are hard to measure and the work accomplished by the OECD Develop- 
ment Centre in building the GID database is an important step forward. It is worthwhile to 
continue this endeavor and invest more resources in the measurement of social institutions 
related to gender inequality. This includes data coverage, coding schemes and the refinement 
of indicators. It would be useful to exploit data available, for example from Demographic 
and Health Surveys (DHS)¹⁶ that specifically address the perception that women have of
violence against women, and to finance surveys in countries where data is not available. 


Secondly, by aggregating variables and subindices, one inevitably loses some information. 
Figures and rankings according to the SIGI and the subindices should not substitute a careful 
investigation of the variables from the database. Furthermore, to understand the situation in 
a given country additional qualitative information could be valuable. 


Thirdly, one should keep in mind that OECD countries are not included in our sample,
as social institutions related to gender inequality in these countries are not well captured by
the 12 variables used for building the composite measures. This does not mean that this
phenomenon is not relevant for OECD countries, but that further research is required to
develop appropriate measures.


Nonetheless, the SIGI and its subindices offer a new perspective to understand gender 
inequality. Empirical results show that the SIGI is statistically non-redundant and adds new 
information to other well-known gender-related measures. The SIGI and the five subindices 
can help policy-makers to detect in which developing countries and in which dimensions of 
social institutions problems need to be addressed. For example, according to the SIGI scores, 
regions with highest inequality are South Asia, Sub-Saharan Africa, and the Middle East 
and North Africa. The composite measures can be valuable instruments to generate public 
discussion. Moreover, the SIGI and its subindices have the potential to influence current 
development thinking as they highlight social institutions that affect overall development. As 
is shown in the literature (e.g. Klasen, 2002; Klasen and Lamanna, 2009), gender inequality 
in education negatively affects overall development. Economic research investigating these 
outcome inequalities should consider social institutions related to gender inequality as possible
explanatory factors. Results from regression analysis show that the SIGI is related to gender 


16 Information is available on the webpage http://www.measuredhs.com/.
inequality in well-being and empowerment, even after controlling for region, religion and
the level of economic development.


1.7 Tables 
Table 1.1: Kendall tau b: Dimension Family Code 
Earmarr Polyg Parauth Inher 
Earmarr Kendall tau b 1 
Number of obs. 112 
p-Value 
Polyg Kendall tau b 0.30 1 
Number of obs. 112 112 
p-Value 0.0001 
Parauth Kendall tau b 0.29 0.48 1 
Number of obs. 112 112 112 
p-Value 0.0001 0.0000 
Inher Kendall tau b 0.23 0.60 0.57 1 
Number of obs. 112 112 112 112 
p-Value 0.0020 0.0000 0.0000 


Earmarr: Early marriage, Polyg: Polygamy, Parauth: Parental authority, Inher: inheritance. See 
section 1.2 for details. The p-values correspond to the null hypothesis that the two variables are 
independent. 
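The tau b statistic reported in Tables 1.1 to 1.7 is Kendall's rank correlation adjusted for ties, which matters here because the underlying variables take only a few ordered values. The following is a minimal sketch in pure Python of how such a coefficient can be computed; it is not the code used to produce the tables, and the function name `kendall_tau_b` and the toy scores are illustrative assumptions.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equally long sequences of ordinal scores.

    tau_b = (C - D) / sqrt((n0 - n1) * (n0 - n2)), where C and D count
    concordant and discordant pairs, n0 is the number of pairs, and n1, n2
    are the numbers of pairs tied in x and in y respectively.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x = tied_y = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        dx, dy = xi - xj, yi - yj
        if dx == 0:
            tied_x += 1
        if dy == 0:
            tied_y += 1
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    n0 = len(x) * (len(x) - 1) // 2
    denom = sqrt((n0 - tied_x) * (n0 - tied_y))
    if denom == 0:
        raise ValueError("tau-b is undefined when a variable is constant")
    return (concordant - discordant) / denom


# Hypothetical scores for five countries (not taken from the database).
early_marriage = [0, 0, 0.5, 1, 1]
polygamy = [0, 0.5, 0.5, 0.5, 1]
tau = kendall_tau_b(early_marriage, polygamy)
```

Pairs that are tied in either variable count as neither concordant nor discordant, so heavily tied data reduce the denominator rather than inflating the coefficient.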


Table 1.2: Kendall tau b: Dimension Civil Liberties 


Obliveil
Freemov Kendall tau b 0.61
Number of obs. 123
p-Value 0.0000


Freemov: Freedom of movement, Obliveil: Freedom of dress. See section 1.2 for details. The p-value
corresponds to the null hypothesis that the two variables are independent.
Table 1.3: Kendall tau b: Dimension Physical Integrity with Missing Women


Femmut Vio Misswom 
Femmut Kendall tau b 1 
Number of obs. 114 
p-Value 
Vio Kendall tau b 0.16 1 
Number of obs. 114 114 
p-Value 0.0382 
Misswom Kendall tau b -0.10 0.11 1 
Number of obs. 114 114 114 
p-Value 0.2160 0.1634 


Femmut: Female Genital Mutilation, Vio: Violence against women, Misswom: Missing women. See 
section 1.2 for details. The p-values correspond to the null hypothesis that the two variables are 
independent. 


Table 1.4: Kendall tau b: Dimension Physical Integrity without Missing Women 


Vio
Femmut Kendall tau b 0.16
Number of obs. 114
p-Value 0.0382


Femmut: Female Genital Mutilation, Vio: Violence against women. See section 1.2 for details. The
p-value corresponds to the null hypothesis that the two variables are independent.
Table 1.5: Kendall tau b: Dimension Ownership Rights


Womland Womloans Womprop
Womland Kendall tau b 1
Number of obs. 122 
p-Value 
Womloans Kendall tau b 0.59 1 
Number of obs. 122 122 
p-Value 0.0000 
Womprop Kendall tau b 0.64 0.60 1 
Number of obs. 122 122 122 


Womland: Women’s access to land, Womloans: Women’s access to loans, Womprop: Women’s 
access to property other than land. See section 1.2 for details. The p-values correspond to the null 
hypothesis that the two variables are independent. 


Table 1.6: Weights from Polychoric PCA 


Weights 
Family code 
Parental authority 0.5212 
Inheritance 0.5404 
Early marriage 0.3877 
Polygamy 0.5348 
Civil liberties 
Freedom of movement 0.7071 
Freedom of dress 0.7071 
Physical integrity 
Female genital mutilation 0.7071 
Violence against women 0.7071 
Ownership rights 
Women’s access to land 0.5811 
Women's access to loans 0.5665
Women's access to other property 0.5843
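The identical weights of 0.7071 for the two-variable subindices in Table 1.6 follow from a general property: the first principal component of any 2x2 correlation matrix with positive correlation has loadings (1/√2, 1/√2), whatever the size of the correlation. The sketch below illustrates this with a simple power iteration and also checks that the reported weights of each subindex have unit length. It is only an illustration: the correlation of 0.6 is a hypothetical value, not a polychoric correlation estimated from the data, and the function name `first_component` is an assumption.

```python
from math import sqrt


def first_component(corr, iterations=500):
    """Approximate the leading eigenvector of a symmetric matrix by power iteration."""
    size = len(corr)
    vec = [1.0 + 0.1 * i for i in range(size)]  # arbitrary non-degenerate start
    for _ in range(iterations):
        nxt = [sum(corr[r][c] * vec[c] for c in range(size)) for r in range(size)]
        norm = sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    return vec


# Hypothetical correlation between the two variables of a subindex.
weights_two_vars = first_component([[1.0, 0.6], [0.6, 1.0]])

# Weights as reported in Table 1.6.
reported = {
    "Family code": [0.5212, 0.5404, 0.3877, 0.5348],
    "Civil liberties": [0.7071, 0.7071],
    "Physical integrity": [0.7071, 0.7071],
    "Ownership rights": [0.5811, 0.5665, 0.5843],
}
# Sum of squared weights per subindex; a normalized eigenvector gives 1.
squared_lengths = {name: sum(w * w for w in ws) for name, ws in reported.items()}
```

With only two variables the aggregation therefore amounts to an equal weighting, while the subindices with three or four variables give slightly different weights to their components.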




Table 1.7: Kendall tau b between Subindices 


Family Civil Physical Son Ownership 
code liberties integrity preference rights 
Family code Kendall tau b 1 
Number obs. 112 
Civil liberties Kendall tau b 0.38 1 
Number obs. 112 123 
p-value 0.0000 
Physical integrity Kendall tau b 0.44 0.26 1 
Number obs. 103 113 114 
p-value 0.0000 0.0005 
Son preference Kendall tau b 0.16 0.43 0.03 1 
Number obs. 112 122 114 123 
p-value 0.0317 0.0000 0.7220 
Ownership rights Kendall tau b 0.55 0.30 0.39 0.10 1 
Number obs. 111 121 112 121 122 
p-value 0.0000 0.0001 0.0000 0.181 


Table 1.8: Pearson Correlation Coefficient (ρ) between the SIGI and the Simple Average of
the Five Subindices


ρ 0.96
Number obs. 102 
p-value 0.0000 
Table 1.9: Comparison of the SIGI and the Simple Average of the Subindices


SIGI Simple Aver. Simple Aver. Rank
Country Ranking Value | Ranking Value minus SIGI rank
Paraguay 1 0.00248 0.03129 

Croatia 2 0.00333 0.02738 -1 
Kazakhstan 3 0.00348 0.03143 0 
Argentina 4 0.00379 0.03548 0 
Costa Rica 5 0.00709 0.05021 0 
Russian Federation 6 0.00725 0.05381 5 
Philippines 7 0.00788 0.06032 8 
El Salvador 8 0.00826 0.06479 8 
Ecuador 9 0.00915 0.07005 9 
Ukraine 10 0.00969 0.05138 -4 
Mauritius 11 0.00976 7 0.05219 4 
Moldova 12 0.00980 0.05267 4 
Bolivia 13 0.00983 0.05300 -4 
Uruguay 14 0.00992 0.05381 4 
Venezuela, RB 15 0.01043 0.05786 -2 
Thailand 16 0.01068 0.06530 1 
Peru 17 0.01213 0.05866 -3 
Colombia 18 0.01273 0.08289 6 
Belarus 19 0.01339 0.05638 -7 
Hong Kong, China 20 0.01465 0.07076 -1 
Singapore 21 0.01526 0.07146 -1 
Cuba 22 0.01603 0.07502 0 
Macedonia, FYR 23 0.01787 0.08186 0 
Brazil 24 0.01880 0.07353 -3 
Tunisia 25 0.01906 0.10123 4 
Chile 26 0.01951 0.10653 5 
Cambodia 27 0.02202 0.08862 0 
Nicaragua 28 0.02251 0.11175 4 
Trinidad and Tobago 29 0.02288 0.11434 5 
Kyrgyz Republic 30 0.02924 0.12716 6 
Viet Nam 31 0.03006 0.08375 -6 
Armenia 32 0.03012 0.08456 -6 
Georgia 33 0.03069 0.09024 -5 
Guatemala 34 0.03193 0.12440 1 
Tajikistan 35 0.03262 0.13772 2 
Honduras 36 0.03316 0.11225 -3 
Azerbaijan 37 0.03395 0.10590 -7 
Lao PDR 38 0.03577 0.14164 


Table 1.9 (continued)


SIGI Simple Aver. Simple Aver. Rank
Country Ranking Value | Ranking Value minus SIGI rank
Mongolia 0.03912 0.16806 4 
Dominican Republic 0.03984 0.14402 
Myanmar 0.04629 0.15532 1 
Jamaica 0.04843 0.13998 4 
Morocco 0.05344 0.19732 2 
Fiji 0.05450 0.15512 -3 
Sri Lanka 0.05914 0.21069 2 
Madagascar 0.06958 0.19385 -2 
Namibia 0.07502 0.24188 2 
Botswana 0.08102 0.20277 -2 
South Africa 0.08677 0.25654 4 
Burundi 0.10691 0.24881 
Albania 0.10720 0.27159 
Senegal 0.11041 0.24241 -2 
Tanzania 0.11244 0.24452 -2 
Ghana 0.11269 0.25684 
Indonesia 0.12776 0.26929 
Eritrea 0.13645 0.22890 -8 
Kenya 0.13704 0.26730 -1 
Cote d'Ivoire 0.13712 0.28623 1 
Syrian Arab Republic 0.13811 0.36194 15 
Malawi 0.14323 0.33096 
Mauritania 0.14970 0.33362 
Swaziland 0.15655 0.34562 
Burkina Faso 0.16161 0.30306 -3 
Bhutan 0.16251 0.31967 -1 
Nepal 0.16723 0.39738 19 
Rwanda 0.16859 0.30592 -5 
Niger 0.17559 0.35373 5 
Equatorial Guinea 0.17597 0.36767 
Gambia, The 0.17830 0.31775 -7 
Central African Republic 0.18440 0.33231 -3 
Kuwait 0.18602 0.37231 8 
Zimbabwe 0.18700 0.36859 
Uganda 0.18718 0.37357 7 
Benin 0.18899 0.33197 -8 
Algeria 0.19024 0.41232 12 
Bahrain 0.19655 0.43106 13 
Mozambique 0.19954 0.38088 5 
Togo 0.20252 0.34352 -9 


Table 1.9 (continued)


SIGI Simple Aver. Simple Aver. Rank
Country Ranking Value | Ranking Value minus SIGI rank
Congo, Dem. Rep. 0.20448 0.32770 -15 
Papua New Guinea 0.20936 0.38431 3 
Cameroon 0.21651 0.40132 4 
Egypt, Arab Rep. 0.21766 0.37798 -1 
China 0.21786 0.26056 -28 
Gabon 0.21892 0.40386 2
Zambia 0.21939 0.35261 -14 
Nigeria 0.21991 0.45401 6 
Liberia 0.22651 0.36290 -12 
Guinea 0.22803 0.36782 -11 
Ethiopia 0.23325 0.35590 -16 
Bangladesh 0.24465 0.44911 1 
Libya 0.26019 0.50580 3 
United Arab Emirates 0.26575 0.50826 4 
Iraq 0.27524 0.52298 4 
Pakistan 0.28324 0.50621 1 
Iran, Islamic Rep. 0.30436 0.52525 3 
India 0.31811 0.52951 3 
Chad 0.32258 0.47332 E 
Yemen 0.32705 0.55679 2 
Mali 0.33949 0.42266 -11 
Sierra Leone 0.34245 0.44886 -10 
Afghanistan 0.58230 0.74613 0 
Sudan 0.67781 0.80051 0 


The data are sorted according to the value of the SIGI. 
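The last column of Table 1.9 can be reproduced by ranking the countries twice, once by the SIGI and once by the simple average, and taking the difference of the two ranks. The sketch below does this for the first three countries of the table only; for these three the ranks within the subset coincide with the full-table ranks because they have the lowest values on both measures. The helper name `ranks` is an assumption, and ties are not handled.

```python
def ranks(values):
    """Map each key to its 1-based rank, with the lowest value ranked first."""
    ordered = sorted(values, key=values.get)
    return {name: position for position, name in enumerate(ordered, start=1)}


# (SIGI value, simple average of the five subindices) as listed in Table 1.9.
table = {
    "Paraguay": (0.00248, 0.03129),
    "Croatia": (0.00333, 0.02738),
    "Kazakhstan": (0.00348, 0.03143),
}

sigi_rank = ranks({country: v[0] for country, v in table.items()})
avg_rank = ranks({country: v[1] for country, v in table.items()})

# Rank according to the simple average minus rank according to the SIGI.
rank_shift = {country: avg_rank[country] - sigi_rank[country] for country in table}
```

A negative shift means that a country looks better under the simple average than under the SIGI, as is the case for Croatia here.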


Table 1.10: Ranking according to the SIGI and the Five Subindices 


Paraguay 
Croatia 


Kazakhstan 
Argentina 

Costa Rica 
Russian Federation 
Philippines 

El Salvador 
Ecuador 

Ukraine 
Mauritius 
Moldova 

Bolivia 

Uruguay 
Venezuela, RB 
Thailand 

Peru 

Colombia 
Belarus 

Hong Kong, China 
Singapore 

Cuba 
Macedonia, FYR 
0.08106
0.14028 
0.04053 
0.06485 
0.08917 
0.04053 
0.04458 
0.04701 
0.04864 
0.05269 
0.07295 
0.15649 
0.05269 
0.07295 
0.02432 
0.10380 
0.09975 
0.11754 
0.15169 
0.08757
0.12878 
0.12878 
0.12878 
0.16999 
0.12878 
0.08757 
0.08757 
0.08757 
0.21635 
0.21635 
0.21635 
0.21635 
0.21635 
0.21635 
0.16999 
0.24059 
0.16999 
0.25756 

0 
0.25756 
0.25756 
0.25756 


SIGI Family code Civil liberties Physical integrity Son preference Ownership rights 
Country Ranking Value | Ranking Value | Ranking Value | Ranking Value | Ranking Value | Ranking Value 
19 3 1 0 


0.17351 
0.17151 
0.17351 


1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
Table 1.10 (continued)


SIGI Family code Civil liberties Physical integrity Son preference Ownership rights 

Brazil 24 0.01880 19 0.06890 48 0.29877 1 0 1 0 
0.01906 0.12738 9 0.12878 89 0.25 1 0 
0.01951 0.13909 0.21635 0.17723 
0.02202 0.14433 0.29877 0 
0.02251 0.12970 0.25756 0.17151 
0.02288 0.15169 0.16999 0 
0.02924 0.15980 0.29877 0.17723 
0.03006 0.03242 0.38634 
0.03012 0.03648 0.38634 
0.03069 0.06485 0.38634 
0.03193 0.10538 0.34513 
0.03262 0.25955 0.25756 
0.03316 0.21610 0.34513 
0.03395 0.14314 0.38634 
0.03577 0.32034 0.21635 
0.03912 0.12001 0.29877 
0.03984 0.11754 0.25756 
0.04629 0.14028 0.38634 
0.04843 0.00405 0.34513 
0.05344 0.26279 0.12878 
0.05450 0.04053 0.38634 
0.05914 0.23404 15 0.16999 
0.06958 0.41138 0.38634 
0.07502 0.35307 0.25756 


Tunisia 
Chile 
Cambodia 
Nicaragua 
Trinidad & Tobago 
Kyrgyz Republic 
Viet Nam 
Armenia 
Georgia 


Guatemala 
Tajikistan 
Honduras 
Azerbaijan 
Lao PDR 
Mongolia 


0.17151 
0.17151 
0 
0 
0.17151 
0.17151 
0.34502 
0 
0.35074 
0.34502 
0.34874 
66 0.34874 
0.17151 
0.34874 
Dominican Republic


Myanmar 


Jamaica 


Morocco 
Fiji
Sri Lanka 

Madagascar 
Namibia 
Table 1.10 (continued)


SIGI Family code Civil liberties Physical integrity Son preference Ownership rights 
Ranking Value | Ranking Value | Ranking Value Ranking Value 
Botswana 0.08102 0.32163 15 0.16999 1 0 0.52225 
South Africa 0.08677 0.42326 0.21635 
Burundi 0.10691 0.33545 0.38634 
Albania 0.10720 0.12288 0.38634 
Senegal 0.11041 0.60250 0.26455 
Tanzania 0.11244 0.49886 0.20151 
Ghana 0.11269 0.36621 0.39575 
Indonesia 0.12776 0.35405 0.39362 
Eritrea 0.13645 0.45538 0.68910 
Kenya 0.13704 0.37027 0.28152 
Cote d'Ivoire 0.13712 0.49012 0.43455 
Syrian Arab Republic 0.13811 0.40269 0.30069 0.25756 
Malawi 0.14323 0.36087 0.29808 0.47362 
Mauritania 0.14970 0.42056 0.30069 0.60183 
Swaziland 0.15655 0.52144 0.29808 0.38634 
Burkina Faso 0.16161 0.53939 0 0.63092 
Bhutan 0.16251 0.20513 0.29808 0.34513 0 
Nepal 0.16723 0.36779 0.29808 0.29877 0.52225 
Rwanda 0.16859 0.32974 0 0.51512 0.68473 
Niger 0.17559 0.64882 0 0.52482 0.34502
Equatorial Guinea 0.17597 0.50291 0.29808 0.51512 0.52225 
Gambia, The 0.17830 0.64303 0 0.59698 0 0.34874 
Central African Rep. 0.18440 0.55902 0 0.58029 0.52225 
Kuwait 0.18602 0.50523 103 0.59876 0.25756 0


0.34502 
0.52225 
0.34874 
0.34502 
0.52225 
0.52225 

0 

0 
0.68473 
0.50650 
0.34874 
0.52225 
0.34502 
0.52225 
0.34502 
Table 1.10 (continued)


SIGI Family code Civil liberties Physical integrity Son preference Ownership rights 
Ranking Value | Ranking Value | Ranking Value Ranking Value 
Zimbabwe 0.18700 0.49075 84 0.29808 59 0.36937 1 0 0.68473 
Uganda 0.18718 0.63697 84 0.29808 0.41058 0.52225 
Benin 0.18899 0.50633 1 0 0.46877 0.68473 
Algeria 0.19024 0.40501 0.59876 0.38634 0.17151
Bahrain 0.19655 0.32147 0.59876 0.38634 0.34874
Mozambique 0.19954 0.69776 0.29808 0.38634 0.52225
Togo 0.20252 0.58833 0.44452 0.68473
Congo, Dem. Rep. 0.20448 0.39038 0.41058 0.83752
Papua New Guinea 0.20936 0.27697 0.38634 0.50825
Cameroon 0.21651 0.54344 0.29808 0.48332 0.68175
Egypt, Arab Rep. 0.21766 0.26647 0.30069 0.82273 0
China 0.21786 0.00405 0 0.29877 0
Gabon 0.21892 0.68387 0.29808 0.51512 0.52225
Zambia 0.21939 0.69197 0 0.38634 0.68473
Nigeria 0.21991 0.42056 0.59876 0.47847 0.52225
Liberia 0.22651 0.53470 0.75756 0.52225
Guinea 0.22803 0.67140 0.64546 0.52225
Ethiopia 0.23325 0.32726 0.77424 0.67801
Bangladesh 0.24465 0.58334 0.59876 0.04121 0.52225
Libya 0.26019 0.39285 0.59876 0.51512 0.52225
Unit. Arab Emirates 0.26575 0.56197 0.59876 0.53180 0.34874
Iraq 0.27524 0.47391 0.59876 0.51997 0.52225
Pakistan 0.28324 0.37821 0.59876 0.28180 0.52225 
Iran, Islamic Rep. 0.30436 0.55792 0.78099 0.51512 0.52225 
Table 1.10 (continued)


SIGI Family code Civil liberties Physical integrity Son preference Ownership rights
Ranking Value | Ranking Value | Ranking Value Ranking Value 
India 0.31811 0.60655 103 0.59876 0.16999 0.52225 
Chad 0.32258 0.79330 98 0.30069 0.43212 0.84049 
Yemen 0.32705 0.59439 119 0.78099 0.38634 0.52225
Mali 0.33949 0.79735 0.97091 0.34502
Sierra Leone 0.34245 0.60159 0.79849 0.84424
Afghanistan 0.58230 0.71598 0.51512 0.68175
Sudan 0.67781 0.67981 0.82273 1
Angola NA 0.54344 NA 0.52225
Bosnia & Herzegovina NA NA 0.25756 0
Chinese Taipei NA NA 0.08757 0
Congo, Rep. NA 0.62450 NA 0.52225 
Guinea-Bissau NA NA 0.75756 0.68473 
Haiti NA 0.37837 0.34513 NA 
Israel NA 0.22712 NA 0 
Jordan NA 0.51739 0.59876 NA 0.52225 
Korea, Dem. Rep. NA NA 0.29808 0.51512 0 
Lebanon NA NA 0.59876 0.38634 0.17351 
Lesotho NA 0.57149 0.29808 NA 0.52225 
Malaysia NA 0.32163 0.59876 NA 0 
Occup. Palestinian Terr. NA 0.48607 0.59876 NA 0.34874 
Oman 0.45364 0.29808 NA 0.34874 
Panama NA 0.11181 0 
Puerto Rico NA 0.21635 NA 
Saudi Arabia 74 0.45364 NA 0.52225 
Table 1.10 (continued)


SIGI Family code Civil liberties Physical integrity 
Country Ranking Value | Ranking Value | Ranking Value | Ranking Value 
NA NA 1 0 NA 


Serbia & Montenegro 


Somalia NA 103 0.59876 0.84213 


Timor-Leste NA 1 0.42755 
Turkmenistan NA 1 0.38634 
Uzbekistan NA 1 0.38634 


Son preference Ownership rights 
Ranking Value | Ranking Value 
NA 43 


0.17151 
0.68473 
0.52225 
0.52225 
Table 1.11: Regional Pattern of the Composite Index and Subindices


ECA LAC EAP SA SSA MENA | Total
SIGI
Quintile 1 21 
Quintile 2 20 
Quintile 3 21 
Quintile 4 20 
Quintile 5 20 
Total 102 
Family Code 

Quintile 1 7 11 4 0 1 0 23 
Quintile 2 5 8 6 1 0 2 22 
Quintile 3 1 1 4 3 9 5 23 
Quintile 4 0 0 0 0 15 7 22 
Quintile 5 0 0 0 3 16 3 22 
Total 13 20 14 7 41 17 | 112
Civil Liberties 

Quintile 1, 2, 3 17 22 14 0 27 3 83 
Quintile 4 0 0 1 3 12 3 19 
Quintile 5 0 0 2 4 3 12 21 
Total 17 22 17 7 42 18 | 123 
Physical Integrity 

Quintile 1 5 13 5 3 4 2 32 
Quintile 2 4 4 1 0 3 2 14 
Quintile 3 7 5 7 3 6 4 32 
Quintile 4 0 0 3 1 13 2 19 
Quintile 5 0 0 0 0 14 3 17 
Total 16 22 16 7 40 13 | 114 
Missing Women 

Quintile 1, 2, 3 15 21 10 1 38 3 88 
Quintile 4 0 1 4 0 4 3 12 
Quintile 5 1 0 3 6 1 12 23 
Total 16 22 17 7 43 18 | 123
Ownership Rights 

Quintile 1 12 12 11 1 2 4 42 
Quintile 2 2 4 2 0 1 1 10 
Quintile 3 2 3 2 1 8 7 23 
Quintile 4 1 1 2 4 18 6 32 
Quintile 5 0 0 0 1 14 0 15 
Total 17 20 17 7 43 18 | 122 


ECA: Europe and Central Asia, LAC: Latin America and the Caribbean, EAP: East Asia and Pacific, SA:
South Asia, SSA: Sub-Saharan Africa, MENA: Middle East and North Africa.
Table 1.12: Statistical Association between the SIGI and other Gender-related Measures


GDI Kendall tau b -0.50 Pearson Corr. Coeff. -0.59
Number obs. 79 p-value 0.0000 p-value 0.0000
GGI (capped) Kendall tau b -0.51 Pearson Corr. Coeff. -0.72
Number obs. 85 p-value 0.0000 p-value 0.0000
GEM Kendall tau b -0.43 Pearson Corr. Coeff. -0.70
Number obs. 33 p-value 0.0005 p-value 0.0000
GEM (revised) Kendall tau b -0.44 Pearson Corr. Coeff. -0.75
Number obs. 33 p-value 0.0003 p-value 0.0000
GGG Kendall tau b -0.47 Pearson Corr. Coeff. -0.73
Number obs. 73 p-value 0.0000 p-value 0.0000
WOSOC Kendall tau b -0.49 Pearson Corr. Coeff. -0.53
Number obs. 99 p-value 0.0000 p-value 0.0000


Data for the Gender-related development Index (GDI) and the Gender Empowerment Measure (GEM) are 
from United Nations Development Programme (2006) and are based on the year 2004. The Gender Gap 
Index (GGI) capped and the revised Gender Empowerment Measure (GEM revised) are taken from Klasen and
Schüler (2009) based on the year 2004. Data for the Global Gender Gap Index (GGG) are from Hausmann 
et al. (2007). The Women's Social Rights Index (WOSOC) data correspond to the year 2007 and are obtained 
from http://ciri.binghamton.edu/. The p-values correspond to the null hypothesis that the SIGI 
and the corresponding measure are independent. 
Table 1.13: Comparison of Ranks: the SIGI and other Gender-related Indices


Country SIGI
Paraguay 1
Croatia 2
Kazakhstan 3
Argentina 4
Costa Rica 5
Russian Federation 6
Philippines 7
El Salvador 8
Ecuador 9
Ukraine 10
Mauritius 11
Moldova 12
Bolivia 13
Uruguay 14
Venezuela, RB 15
Thailand 16
Peru 17
Colombia 18
Belarus 19
Hong Kong, China 20
Singapore 21
Cuba 22
Macedonia, FYR 23
Brazil 24
Tunisia 25
Chile 26
Cambodia 27
Nicaragua 28
Trinidad and Tobago 29
Kyrgyz Republic 30
Viet Nam 31
Armenia 32
Georgia 33
Guatemala 34
Tajikistan 35
Honduras 36
Azerbaijan 37


GDI GGI GEM 


(capped) 

6 16 
18 1 
2 21 
7 40 
10 6 
22 30 
29 35 
19 7 
12 46 
35 24 
5 17 
17 23 
16 8 
23 24 
15 11 
11 3 
37 

13 32 
14 20 
26 72 
3 44 
45 10 
37 56 
9 33 
34 11 
31 2 
20 4 
39 64 
40 19 
38 36 
16
28 


24 


GEM GGG 
(revised) 

32 

7 3 

10 

3 11 

2 8 

22 18 

8 1 

14 20 

11 17 

23 25 

44 

15 41 

17 39 

13 24 

18 22 

6 37 

16 7 

6 

11 38 

5 

9 13 

19 36 

55 

20 45 

26 32 

49 

5 19 

33 

15 

34 

24 30 

58 

40 

10 31 


39 




Table 1.13 (continued)


Country SIGI GDI GGI GEM GEM GGG WOSOC 
(capped) (revised) 

Lao PDR 38 47 45 
Mongolia 39 36 27 25 25 27 
Dominican Republic 40 25 38 29 19 
Myanmar 41 14 64 
Jamaica 42 30 18 14 3 
Morocco 43 19 
Fiji 44 3 
Sri Lanka 45 24 51 29 28 2 19 
Madagascar 46 53 15 48 19 
Namibia 47 43 33 5 4 9 19 
Botswana 48 46 59 18 21 23 64 
South Africa 49 41 42 4 19 
Burundi 50 72 24 64 
Albania 51 19 
Senegal 52 64 
Tanzania 53 66 27 7 1 12 19 
Ghana 54 48 27 28 19 
Indonesia 55 32 39 42 19 
Eritrea 56 19 
Kenya 57 57 42 43 64 
Cote d'Ivoire 58 68 80 64 
Syrian Arab Republic 59 33 63 56 64 
Malawi 60 70 41 46 19 
Mauritania 61 60 48 60 64 
Swaziland 62 59 82 64 
Burkina Faso 63 76 50 66 64 
Bhutan 64 3 
Nepal 65 51 61 70 64 
Rwanda 66 63 9 3 
Niger 67 79 78 19 
Equatorial Guinea 68 42 62 19 
Gambia, The 69 50 19 
Central African Republic 70 75 67 19 
Kuwait 71 1 48 51 64 
Zimbabwe 72 58 57 47 19 
Uganda 73 54 31 21 19 
Benin 74 67 73 69 64 
Algeria 75 64 
Bahrain 76 4 76 64 64 
Mozambique 77 71 47 16 64 


Table 1.13 (continued)


Country SIGI GDI GGI GEM GEM GGG WOSOC
(capped) (revised)


Togo 64 
Congo, Dem. Rep. 79 73 60 64 
Papua New Guinea 80 50 22 19 
Cameroon 81 55 54 65 64 
Egypt, Arab Rep. 82 32 31 68 64 
China 83 20 13 35 64 
Gabon 84 64 
Zambia 85 69 64 54 64 
Nigeria 86 64 66 59 64 
Liberia 87 68 19 
Guinea 88 65 58 19 
Ethiopia 89 62 64 
Bangladesh 90 49 52 27 27 53 64 
Libya 91 69 64
United Arab Emirates 92 8 74 30 32 57 64 
Iraq 93 84 64 
Pakistan 94 51 81 26 28 71 64 
Iran, Islamic Rep. 95 27 54 31 30 67 64 
India 96 44 T] 63 19 
Chad 97 74 75 12 64 
Yemen 98 62 83 33 33 73 64 
Mali 99 77 53 61 19 
Sierra Leone 100 78 71 64 
Afghanistan 101 85 19 
Sudan 102 56 79 64 
Number of obs. 102 79 85 33 33 73 99 


Data for the Gender-related development Index (GDI) and the Gender Empowerment Measure (GEM) are 
from United Nations Development Programme (2006) and are based on the year 2004. The Gender Gap 
Index (GGI) capped and the revised Gender Empowerment Measure (GEM revised) are taken from Klasen and 
Schüler (2009) based on the year 2004. Data for the Global Gender Gap Index (GGG) are from Hausmann 
et al. (2007). The Women's Social Rights Index (WOSOC) data correspond to the year 2007 and are obtained 
from http://ciri.binghamton.edu/. 




Table 1.14: Linear Regression with Dependent Variables GGG and Ratio GDI to HDI 


GGG Ratio of GDI to HDI
coef/se coef/se

SIGI -0.284*** -0.054*** 
(0.089) (0.017) 

log GDP 0.014* 0.004 
(0.008) (0.003) 

SA -0.006 -0.001 
(0.032) (0.008) 

ECA -0.012 0.007 
(0.017) (0.005) 

LAC -0.040** -0.000 
(0.017) (0.005) 

MENA -0.043 0.001 
(0.028) (0.011) 

EAP 0.005 0.010** 
(0.022) (0.005) 

Muslim -0.001 -0.002 
(0.018) (0.006) 

Christian 0.026 0.002 
(0.017) (0.004) 

constant 0.570*** 0.960*** 
(0.063) (0.020) 

Number of obs. 73 79
Adjusted R2 0.617 0.438
Prob F 0.000 0.000


HC3 robust standard errors in brackets.
Note: *** p<0.01, ** p<0.05, * p<0.1
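The HC3 standard errors mentioned in the note to Table 1.14 rescale each squared residual by (1 - h)^2, where h is the leverage of the observation, before forming the usual sandwich estimator. The following is a minimal sketch for the special case of a single regressor plus a constant, written in pure Python; the regressions in Table 1.14 include several controls and were not produced with this code. The function name `ols_hc3_simple` and the toy data are assumptions, and the sketch requires at least three observations with distinct x values.

```python
from math import sqrt


def ols_hc3_simple(x, y):
    """OLS of y on a constant and one regressor x, with an HC3 standard error for the slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    meat = 0.0
    for xi, yi in zip(x, y):
        residual = yi - intercept - slope * xi
        leverage = 1.0 / n + (xi - x_bar) ** 2 / sxx  # diagonal of the hat matrix
        # HC3: inflate each residual by 1 / (1 - leverage) before squaring.
        meat += (xi - x_bar) ** 2 * (residual / (1.0 - leverage)) ** 2
    return slope, intercept, sqrt(meat) / sxx


# Toy data, purely illustrative.
slope, intercept, se_slope = ols_hc3_simple([0, 1, 2, 3], [0, 1, 1, 3])
```

Because high-leverage observations receive a larger correction, HC3 tends to be more conservative than the unadjusted heteroskedasticity-robust estimator in small samples such as the 73 and 79 countries used here.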
1.8 Figures


Figure 1.1: MJCA for the Dimension Family Code


Earmarr stands for the variable Early marriage, Polyg for Polygamy, Parauth is the variable Parental
authority and Inher is the variable Inheritance. See section 1.2.
Figure 1.2: MJCA for the Dimension Civil Liberties


Freemov stands for the variable Freedom of movement. Obliveil is the variable Freedom of dress.
See section 1.2.
Figure 1.3: MJCA for the Dimension Physical Integrity with Missing Women


Femmut stands for the variable Female Genital Mutilation, Vio for Violence against women and Misswom is
the variable Missing women. See section 1.2.
Figure 1.4: MJCA for the Dimension Physical Integrity without Missing Women


Femmut stands for the variable Female Genital Mutilation and Vio for Violence against women. See
section 1.2.
Figure 1.5: MJCA for the Dimension Ownership Rights


Womland stands for the variable Women's access to land, Womloans is the variable Women's access
to loans and Womprop is the variable Women's access to property other than land. See section 1.2.


Essay 2 


Why we should all care about social
institutions related to gender inequality


Abstract 


Institutions are a major factor explaining development outcomes. This study focuses on social
institutions related to gender inequality, understood as long-lasting norms, values and
codes of conduct that shape gender roles, and presents evidence on why they matter for
development. We derive hypotheses from existing theories and test them empirically at the
cross-country level with linear regressions, using the newly created Social Institutions and
Gender Index (SIGI) and its subindices as measures of social institutions. We find that, apart
from geography, political system, religion, and the level of economic development, one has
to consider social institutions related to gender inequality to better account for differences
in development. Our results show that social institutions that deprive women of their autonomy
and bargaining power in the household, or that increase the private costs and reduce the
private returns to investments in girls, are associated with lower female education, higher
fertility rates and higher child mortality. Moreover, social institutions related to gender
inequality are negatively associated with governance measured as ‘rule of law’ and ‘voice and
accountability’.


Based on joint work with Stephan Klasen and Maria Ziegler. 


2.1 Introduction


Institutions are a major factor explaining development outcomes. They guide human behavior
and shape human interaction (North, 1990). Institutions are humanly devised to reduce
uncertainty and transaction costs; they are rooted in culture and history, and sometimes
they are taken for granted and become beliefs (Hall and Taylor, 1996; De Soysa and Jütting,
2007). Our study centers on a special type of institution and its explanatory value for
development outcomes: social institutions related to gender inequality.


It is an established fact that gender inequalities come at a cost. Besides the consequences 
that the affected women experience because they are deprived of their basic freedoms (Sen, 
1999), gender inequalities affect the whole society. They can lead to ill-health, low human 
capital, bad governance and lower economic growth (e.g. World Bank, 2001; Klasen, 2002). 
Gender inequalities can be observed in outcomes like education, health and economic and 
political participation, but they are rooted in gender roles that evolve from institutions that 
shape everyday life and form role models that people try to fulfill and satisfy. We refer to 
these long-lasting norms, values and codes of conduct as social institutions related to gender 
inequality. 


We investigate the impact of these social institutions related to gender inequality on
development outcomes, controlling for relevant determinants such as religion, political system,
geography and the level of economic development. As development outcomes we choose
indicators from the fields of education, demographics, health and governance. In particular,
we use female secondary schooling, fertility rates, child mortality and governance in the
form of ‘rule of law’ and ‘voice and accountability’. We choose these indicators as they are
related to economic development and allow us to find out whether social institutions related
to gender inequality hinder progress in reaching the Millennium Development Goals.1


Most of the studies that have a similar research focus are conducted at the household level 
and proxy social institutions related to gender with measures of the autonomy or status of 
women (e.g. Abadian, 1996; Hindin, 2000). At the cross-country level data are scarce and 
therefore only a few studies are available that center on the development impact of
gender-relevant social institutions (e.g. Morrison and Jütting, 2005; Jütting et al., 2008).


1In particular, goal 3 “Promote gender equality and empower women”, goal 4 “Reduce child mortality” and
goal 5 “Improve maternal health” are relevant here, although the other goals can be at least indirectly linked
to our chosen indicators.


Using the Social Institutions and Gender Index (SIGI) and its five subindices Family code,
Civil liberties, Physical integrity, Son preference and Ownership rights proposed in Essay 1,
we investigate whether social institutions related to gender inequality are associated with the
chosen development outcomes at the cross-country level.2

These indices cover between 102 and 123 developing countries and are built out of twelve
variables of the OECD Gender, Institutions and Development Database that proxy social
institutions through prevalence rates, indicators of social practices and legal indicators
(Morrison and Jütting, 2005; Jütting et al., 2008). The five subindices of the SIGI each
measure one dimension of social institutions related to gender inequality. The Family code
subindex captures institutions that directly influence the decision-making power of women
in the household. It is composed of four variables that measure whether women have the right
to be the legal guardian of a child during marriage and whether women have custody rights
over a child after divorce, whether there are formal inheritance rights for wives, the
percentage of girls between 15 and 19 years of age who are or have been married, and the
acceptance of polygamy in the population. The Civil liberties subindex covers the freedom of social
participation of women and combines two variables, freedom of movement and freedom of
dress, which measure the level of restrictions women face in moving freely outside their own
household, and the extent to which women are obliged to follow a certain dress code in
public. The Physical integrity dimension comprises two indicators of violence against women:
the existence of laws against domestic and sexual violence, and the percentage of women
who have undergone female genital mutilation. The subindex Son preference measures the
economic valuation of women and is based on a ‘missing women’ variable that measures an
extreme form of preferring boys over girls based on information about the female population
that has died as a result of gender inequality. The last subindex, Ownership rights, covers the
access of women to several types of property: land, credit and property other than land. The
values of the SIGI and of all the subindices are between 0 and 1. The value 0 means no or
very low inequality and the value 1 indicates high inequality.

2As discussed in Essay 1, an alternative measure of social institutions would be the Women’s Social Rights
index (WOSOC) of the CIRI Human Rights Data Project (http://ciri.binghamton.edu/), which
measures from a human rights perspective the type of institutions we are interested in. We prefer to work
with the SIGI and its subindices and not with WOSOC as the latter also covers outcomes of these institutions
and does not allow one to differentiate between dimensions of social institutions, e.g. between what happens
within the family and what happens in public life. Moreover, WOSOC can only take four values, from 0 to 3,
which makes it difficult to compare countries as there are many ties in the data.

3The data are available at the web-pages http://www.wikigender.org and
http://www.oecd.org/dev/gender/gid.

4To extract the common information of the variables used to construct one subindex the method of polychoric
principal component analysis is used (Kolenikov and Angeles, 2009).

5Countries where this information is not available are assigned scores based on the legality of polygamy.


The SIGI combines the five subindices into a multidimensional measure of deprivation of
women in a country. The underlying methodology of construction is inspired by the
Foster-Greer-Thorbecke poverty measures (Foster et al., 1984). It leads to penalization of high
inequality in each dimension and allows for only partial compensation between dimensions.
The value of the SIGI is calculated as follows:


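\begin{equation}
\begin{aligned}
\mathrm{SIGI} ={} & \tfrac{1}{5}\,(\text{Subindex Family Code})^{2} + \tfrac{1}{5}\,(\text{Subindex Civil Liberties})^{2} \\
& + \tfrac{1}{5}\,(\text{Subindex Physical Integrity})^{2} + \tfrac{1}{5}\,(\text{Subindex Son Preference})^{2} \\
& + \tfrac{1}{5}\,(\text{Subindex Ownership Rights})^{2}
\end{aligned}
\tag{2.1}
\end{equation}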
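To make the aggregation concrete, the sketch below computes the SIGI as the unweighted average of the squared subindex values and compares it with the simple average of the subindices, using two countries from Tables 1.9 and 1.10. It is an illustration rather than the original computation: the function names are assumptions, and so are the subindex entries that the tables do not show legibly (read here as 0 for Brazil's Civil liberties and 1 for China's Son preference), although the results agree with the published SIGI values.

```python
def sigi(subindices):
    """Unweighted average of the squared subindex values, as in equation (2.1)."""
    return sum(value ** 2 for value in subindices) / len(subindices)


def simple_average(subindices):
    """Plain arithmetic mean of the subindex values."""
    return sum(subindices) / len(subindices)


# Order: Family code, Civil liberties, Physical integrity, Son preference,
# Ownership rights. Values are read from Tables 1.9 and 1.10; entries not shown
# legibly there (Civil liberties for Brazil, Son preference for China) are
# assumptions.
brazil = [0.06890, 0.0, 0.29877, 0.0, 0.0]
china = [0.00405, 0.0, 0.29877, 1.0, 0.0]

brazil_sigi = sigi(brazil)
china_sigi = sigi(china)
china_average = simple_average(china)
```

Both countries have the same Physical integrity score, but China's single extreme score in Son preference dominates its SIGI value because squaring penalizes high inequality in one dimension more than moderate inequality spread across dimensions. This is the partial compensation property described above, and it is consistent with China ranking much worse under the SIGI than under the simple average in Table 1.9.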


The main shortcoming of these indices is that they cover only developing countries. This 
is due to the fact that the variables used as input do not measure relevant social institutions 
related to gender inequalities in OECD countries. Further research is required to develop 
appropriate measures for developed countries. Nevertheless, these social institutions
indicators are innovative measures of the social, economic and political valuation of women and
add information to other existing measures of gender inequality in well-being and
empowerment.6 The SIGI and its subindices focus on the roots of gender inequality in a society and
not on gender inequality in outcomes. The ranking of countries according to the SIGI and its 
subindices is presented in Table 1.10. 

We proceed as follows. First, we look for relevant theories linking — at least implicitly 
— social institutions related to gender inequality with development outcomes such as health, 
demographics, education and the governance of a society. We refer to bargaining household 
models (e.g. Manser and Brown, 1980; McElroy and Horney, 1981; Lundberg and Pollak, 
1993) and models considering the costs and returns of children (e.g. Becker, 1981; King and 
Hill, 1993; Hill and King, 1995) as well as to contributions from several disciplines on gov- 
ernance and democracy. These contributions focus on differences in behavior between men 
and women, and on women's movements as a countervailing power to personal rule (e.g. Swamy
et al., 2001; Tripp, 2001). Secondly, we run several linear regressions with the outcome 
indicators as dependent variables and the SIGI and its subindices as the main explanatory 
variables. Our results show that social institutions related to gender inequality matter; higher 
inequality in social institutions is associated with lower development outcomes. In a related 
paper, Jütting et al. (2010) follow the same econometric procedure we use here and study the 
impact of the SIGI and its subindices on gender inequality in labor market outcomes.

The rest of the paper consists of 5 sections. In section 2.2 we review existing theory 
on household decision-making and incorporate social institutions into the models, deriving 
5 Examples are the Gender-Related Development Index (GDI) and the Gender Empowerment Measure (GEM)
from United Nations Development Programme (1995), the Global Gender Gap Index from the World
Economic Forum (Lopez-Claros and Zahidi, 2005), the Gender Equity Index developed by Social Watch
(Social Watch, 2005), and the African Gender Status Index proposed by the Economic Commission for
Africa (Economic Commission for Africa, 2004).
hypotheses on their impact on female education, fertility and child mortality. In section 2.3,
we formulate hypotheses on the impact of social institutions on ‘rule of law’ and ‘voice
and accountability’ based on the literature on governance, democracy and gender. Data are
described in section 2.4. The empirical estimation and the results are presented in section
2.5. Section 2.6 concludes.


2.2 Social Institutions and Household Decisions 


In this section, we review the existing literature about the potential effects of social
institutions related to gender inequality on development outcomes. It is beyond the scope of this
study to develop a formal model that incorporates social institutions and specifies the ex- 
act functional relationships. Instead, we use the non-unitary approach to the household and 
the method of Net Present Value which give hints on how social institutions operate at the 
household level. These approaches provide the necessary micro-foundation for the empirical 
analysis which can only be conducted at the macro-level because of the available data. 

Non-unitary household models show that household decisions are the result of the distri- 
bution of bargaining power in the household. Common to the non-unitary models, initiated 
by Manser and Brown (1980) and McElroy and Horney (1981), is a game-theoretic approach 
to the household. Husband and wife each have their own utility function, $U^h(c^h)$ for the husband
and $U^w(c^w)$ for the wife, which depends on the consumption of private goods $c$.⁷ They
bargain over the allocation of resources to maximize their utility. If they do not reach an
agreement, they receive a payoff which corresponds to an individual ‘threat point’, $P^h(S,Z)$
and $P^w(S,Z)$, which comprises the utilities associated with non-agreement.⁸ S and Z are
defined below. The implication of non-unitary models is that household members do not
simply pool resources and that inequality in power may cause inequality in outcomes
(Kanbur, 2003; Pollak, 2003, 2007; Lundberg and Pollak, 2008).⁹ Empirical evidence supports
this (e.g. Thomas, 1997; Schultz, 1990; Haddad and Hoddinott, 1994; Rasul, 2008).
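The role of the threat point can be illustrated with a stylized Nash bargaining example. The sketch below is not the model of the cited authors: it assumes linear utilities (utility equals private consumption), prices normalized to one, and threat points given as plain numbers that are taken to be already evaluated at S and Z. The function name `nash_split` and all numbers are hypothetical.

```python
def nash_split(y, p_h, p_w):
    """Consumption levels (c_h, c_w) that maximize the Nash product
    (c_h - p_h) * (c_w - p_w) subject to c_h + c_w = y.

    With linear utilities the solution gives each spouse the own threat
    point plus half of the surplus that remains after both threat points
    have been covered.
    """
    surplus = y - p_h - p_w
    if surplus < 0:
        raise ValueError("no agreement can make both spouses better off")
    return p_h + surplus / 2, p_w + surplus / 2


# Equal threat points: household income is split evenly.
equal = nash_split(100, 20, 20)

# A social institution that lowers only the wife's threat point (the
# husband's fallback is now relatively stronger) shifts consumption to him.
unequal = nash_split(100, 40, 20)
```

In this stylized setting, anything that weakens the wife's fallback position relative to the husband's reduces her share of household consumption, even though total household income is unchanged.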


7 Certainly, there are public goods in the household that both husband and wife consume within the marriage.

8 The threat point may be external to the marriage. In this case it corresponds to the individual's utility outside
the family in case of divorce, as it is modeled in the divorce threat models of Manser and Brown (1980) and
McElroy and Horney (1981). In the separate spheres bargaining models of Lundberg and Pollak (1993) the
threat point is internal to the marriage and is the utility associated with a non-cooperative equilibrium within
marriage given by traditional gender roles and social norms, where the spouses receive benefits due to the
joint consumption of public goods.

9 Using Nash bargaining, a solution to these non-unitary models can be found. Husband and wife maximize the
Nash product function $N = [U^h(c^h) - P^h(S,Z)][U^w(c^w) - P^w(S,Z)]$, subject to a pooled budget constraint.
The result is the demand function $c^i = f(p, y, S, Z)$, with $p$ for prices, $y$ for total household income and $i = w, h$
(Lundberg and Pollak, 2008).
If husband and wife have to take decisions about their sons and daughters which will
affect the future, then time needs to be considered. The method of the Net Present Value
(NPV) makes it possible to take into account present and future costs and returns to investments.
To simplify the illustration we ignore that bargaining takes place and name the decision-maker
‘parents’. The maximization of utility in a multi-period model leads parents to consider
the costs and returns of their investment in their children (e.g. King and Hill, 1993). This
private calculation of parents at period t = 0 can then be represented with the NPV of the
investment in a child, with $\mathrm{NPV} = \sum_{t=0}^{T} \frac{R_t - K_t}{(1+r)^t}$, where $T$ is the number of time periods
considered, $R_t$ represents the returns and $K_t$ the costs of investments in a child in period $t$,
and $r$ represents the discount rate. Like the threat point P in the non-unitary models, R and K are
functions of S and Z that will be explained below. If the NPV is positive, parents decide to invest in
a child. Gender inequality in the investments in boys and girls arises if the NPV of boys is
larger than that of girls.¹⁰
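The NPV comparison described above can be sketched as follows. The function assumes the standard discounting formula with periods t = 0, ..., T; the function name `npv` and all cost and return figures are hypothetical and serve only to show how lower expected returns for girls can turn a positive NPV into a negative one.

```python
def npv(returns, costs, r):
    """Net present value of an investment with per-period returns R_t and
    costs K_t (t = 0, 1, ..., T), discounted at rate r."""
    if len(returns) != len(costs):
        raise ValueError("returns and costs must cover the same periods")
    return sum(
        (R - K) / (1 + r) ** t
        for t, (R, K) in enumerate(zip(returns, costs))
    )


# Hypothetical example: identical schooling costs in the first two periods,
# but lower expected returns for the daughter in the last two periods.
costs = [100, 100, 0, 0]
npv_boy = npv([0, 0, 150, 150], costs, r=0.10)
npv_girl = npv([0, 0, 90, 90], costs, r=0.10)

invest_in_boy = npv_boy > 0    # parents invest if the NPV is positive
invest_in_girl = npv_girl > 0
```

With these illustrative numbers the investment in the son has a positive NPV and the investment in the daughter a negative one, although the costs are the same for both children.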

Finally, let us explain S and Z. S can be defined as ‘extrahousehold environmental pa- 
rameters’ (McElroy, 1990) or ‘gender-specific environmental parameters’ (Folbre, 1997) that 
influence the threat point in the non-unitary household models and the NPV of a child. We 
consider that S can be best described as social institutions related to gender inequality. Z 
represents all other influential factors besides S. 


2.2.1 Social Institutions and Female Education 


The following examples illustrate how social institutions related to gender inequality affect
the private costs and returns of educational investments.¹¹ Social institutions related to
gender inequality influence the costs of education as they shape a gendered division of labor and
the opportunity costs of educating girls. Opportunity costs include income from child labor
and are higher for girls when they are expected to do housework, to care for their younger
siblings or to work in agriculture (Hill and King, 1995; Lahiri and Self, 2007). Social
institutions related to gender inequality also affect the returns to education. The returns are
generally lower for girls than for boys because girls and women are discriminated against in the
labor market in the form of entry restrictions and wage gaps. Thus, boys are expected to
be economically more productive. Furthermore, parents often expect only low returns from


10 See Pasqua (2005) who considers both perspectives, the non-unitary approach to the household and the cost
and returns approach in the case of education of girls.

11 It must be noted that the private NPV of investments in the education of children does not correspond to the
social NPV. Social returns to education, especially female education, are often higher than the private ones.
There is evidence that society benefits from female education as it contributes to overall development and
drives economic growth (Hill and King, 1995; Klasen, 2002; Braunstein, 2007; Klasen and Lamanna, 2009).
The resulting investment in female education will then often be sub-optimal.
female education because the daughter marries and leaves the house, implying that the family
loses her labor. As a consequence, sons become the building block of their parents'
old-age security (Hill and King, 1995; Pasqua, 2005; Song et al., 2006).¹²

The costs and returns perspective does not rule out that the distribution of decision-making 
power in the household matters. The non-unitary household approach can be used to explain 
low female education (Pasqua, 2005). Several empirical studies show that when women have
more resources at their disposal, investments in the education of girls are higher (e.g. Schultz, 2004;
Emerson and Souza, 2007).


Hypothesis 1: Social institutions that deprive women of their autonomy and bargaining
power in the household or that increase the private costs and reduce the private returns
to investments in female education are associated with lower female education than in a
more egalitarian environment.


2.2.2 Social Institutions and Fertility and Child Mortality Rates 


Social institutions related to gender inequality that influence female decision-making power 
in the household and the NPV of the investment in girls in comparison to boys are also 
relevant for fertility levels and child mortality. 

Concerning fertility, one can use the non-unitary household approach and argue that the
net utility of a woman associated with having a child might differ from that of a man. If one
assumes that man and woman derive the same satisfaction from having a child, the net utility
a woman derives is lower than that of the man as she bears most of the costs of having
children. These costs are related to the discomfort and health risks of pregnancy, and
the income losses associated with time spent on child care. This might explain why women
want fewer children than men, but cannot achieve their objectives as social institutions restrict
their power in limiting the number of children born. Empirical studies support the hypothesis
that reduced female bargaining power leads to shorter time spans between births, a lower use
of contraceptives and higher fertility levels (Thomas, 1990; Abadian, 1996; Hindin, 2000;
Saleem and Bobak, 2005; Seebens, 2008).

The perspective of the NPV provides a second explanation for higher fertility. In the 
absence of well-functioning insurance markets and pension systems, parents in developing 
countries may need more children to feel secure. Depending on the costs of a child and 


12 In addition to all of these considerations, social institutions related to gender inequality might affect the
supply of schooling which might influence the decision to send girls to school if school environments are 
hostile to the needs of girls (e.g. no female teachers available, long distances to school or prices in favor of 
boys) (Hill and King, 1995; Alderman et al., 1996; Pasqua, 2005; Lahiri and Self, 2007). 
the returns to the investment in a child, parents will consider having more children. As was
explained in the previous subsection on female education, social institutions related to gender
inequality affect the NPV of investments in children. If these social institutions lower income
earning opportunities for girls, the NPV of investments in girls will be lower than the NPV of
investments in boys. Hence, sons yield the promise of more economic security as compared
to daughters. As long as parents cannot perfectly control the sex of their offspring, they will
bear more children to increase the chance of having more sons (Cain, 1984; Abadian, 1996;
Kazianga and Klonner, 2009).


To explain higher child mortality levels with social institutions that disadvantage women 
one has to bear in mind that mothers are usually the primary caregivers of children. Within 
the non-unitary framework, if mothers have only limited power in the household, they are 
constrained in the use of health care or in the access to food and other goods necessary 
for children. Thus, they cannot take care of their children as they would without those 
restrictions. This might lead to worse child health and higher child mortality rates (Thomas, 
1990, 1997; Bloom et al., 2001; Smith et al., 2002; Maitra, 2004; Shroff et al., 2009). 


From the NPV perspective it might be rational for parents to invest more in the health and
nutrition of boys than in those of girls, who as a consequence could suffer more heavily from health
problems and experience higher mortality rates than boys. It is possible that this behavior
increases overall child mortality rates. In addition, the limited education that women typically
receive in patriarchal societies as a result of past NPV calculations of their parents might
also lead to worse child health and to higher child mortality figures (Schultz, 2002; Shroff
et al., 2009).


Hypothesis 2: Social institutions that deprive women of their autonomy and bargaining
power in the household or that increase the private costs and reduce the private returns
of investments in girls are associated with higher fertility levels than in an egalitarian
environment.


Hypothesis 3: Social institutions that deprive women of their autonomy and bargaining
power in the household or that increase the private costs and reduce the private returns
of investments in girls are associated with higher child mortality than in an egalitarian
environment.
2.3 Social Institutions and the Society: Governance


In societies where social institutions limit the rights of women, and where women's place is 
restricted to the private sphere, they have no or less say in the public and political domain. 
What is the impact of social institutions related to gender inequality on governance? We use
the definition by Kaufmann et al. (2008, p. 7), who describe governance “as the traditions and
institutions by which authority in a country is exercised. This includes the process by which
governments are selected, monitored and replaced; the capacity of the government to effectively
formulate and implement sound policies; and the respect of citizens and the state for the
institutions that govern economic and social interactions among them.”

There are at least two approaches that allow us to link social institutions with governance.
First, there exist psychological and sociological explanations that state that women are less
egoistic than men. Women are more risk-averse, they tend to follow the rules and they are
more community-oriented than men (Dollar et al., 2001; Swamy et al., 2001). According to this
view, countries in which women have more power will have a political system that is more
rule-oriented, responsive and accountable. Second, women's movements, being the answer to the
exclusion of women from power, play an important role in increasing the quality of political systems
by challenging e.g. personal rule (Waylen, 1993; Tripp, 2001). This argumentation suggests
that countries with social institutions that hinder women from organizing and expressing their
interests might lack an important oppositional force and therefore have a lower quality of
governance.


Hypothesis 4: Social institutions related to high gender inequality inhibit the building blocks
of good governance. In societies with social institutions favoring gender inequality political
systems will be less responsive and less open to the citizens, so that ‘voice and
accountability’ will be reduced.


Hypothesis 5: Social institutions related to high gender inequality inhibit the building blocks
of good governance. In societies with social institutions favoring gender inequality there
might be more personal rule in the political system as well as inequality in justice and legal
systems, so that the ‘rule of law’ will be weakened.


2.4 Data 


Our investigation uses macro-data at the country level. Table 2.1 gives an overview of the
variables used for our estimations, the definitions and the data sources. Descriptive statistics 
of the variables used are presented in Table 2.2. As main regressors we use the SIGI and its
five subindices Family code, Civil liberties, Physical integrity, Son Preference and
Ownership rights in our estimations to check their explanatory value for the development outcomes
female education, fertility, child mortality and governance.

First, we are interested in the impact of social institutions on female education, fertility
and child mortality. As dependent variables we use total fertility rates from World Bank
(2009a) and child mortality rates from World Bank (2008). To measure education we choose
female gross secondary school enrollment rates because secondary education enables important
functionings and empowers women. Furthermore we assume that parents take into account that basic
education of both boys and girls is necessary for fulfilling tasks related to the household.
Data for secondary school enrollment are from World Bank (2009a).

Second, we want to estimate the association between governance and our social institu- 
tions measures. We use the Governance Indicators developed by Kaufmann et al. (2008) 
and choose two of them to capture equality before the law, justice, tolerance and security 
as well as responsiveness, political openness and accountability in the political system. The 
rule of law index measures the extent to which contracts are enforced and property rights are 
ensured and the extent to which people trust in the state and respect the rules of the soci- 
ety. The voice and accountability index proxies civil and political liberties like freedom of 
expression, freedom of association, free media and the extent of active and passive political 
participation of citizens. 

In all regressions we control for the level of economic development, religion, region and
the political system in a country. The specific variables we use are:


• the log of per capita GDP in constant prices to control for the level of economic
development (US$, PPP, base year: 2005);

• a Muslim majority and a Christian majority dummy to control for the impact of
religion, the left-out category being countries that have neither a majority of Muslim nor
a majority of Christian population;

• region dummies to capture geography and other unexplained heterogeneity that might
go together with region, the left-out category being Sub-Saharan Africa;

• two political institutions variables, the electoral democracy variable and the civil
liberties index from Freedom House (2008) that together measure liberal democracy which
is assumed to be related to responsiveness to the needs of the public, political openness
and tolerance in a country.¹³


We use different additional control variables in each regression following suggestions in
the literature. In the fertility and child mortality regressions, we additionally control for
13 We multiply the civil liberties index by -1 to facilitate interpretation. 
• female literacy rates to measure the ability of women to control their reproductive
behavior, to care for themselves and their children (e.g. Basu, 2002; Hatt and Waters,
2006; Lay and Robilliard, 2009);

• a dummy proxying for high HIV/AIDS prevalence rates to control for extreme health
problems especially in Sub-Saharan Africa due to AIDS (e.g. Foster and Williamson,
2000).


The governance regressions exclude as control variables the civil liberties index from
Freedom House as this index is used to build the voice and accountability index that we
choose as dependent variable. We keep the electoral democracy variable because it does not
pose a problem. We additionally include as control variables


• the share of literate adult population to control for the population's ability to be
informed, to express their needs and to hold politicians accountable (Keefer and
Khemani, 2005);

• ethnic fractionalization as it might disturb governance through identity politics,
patronage and distribution conflicts (e.g. Collier, 2001; Tripp, 2001);

• a measure of trade openness as openness increases the incentives to build ‘good’
institutions to attract trading partners, to join trading agreements etc. (e.g. Al-Marhubi,
2005).


Social institutions, i.e. normative frameworks, change only slowly and incrementally. As
the social institutions indicators are not expected to change much over time we have to decide
which year or time span should be covered by the other variables. For our response variables
we choose to take the average of the existing values over five or six years (2000-2005,
2001-2005). For the control variables we take the averages of the existing values over ten years
(1996-2005).¹⁴ The averages provide information that is more stable than the value for a particular
year. Using a longer time span for the control variables than for the response variables allows
us to capture possible time delays until effects can be observed. Nevertheless, we acknowledge
that the choice of the time spans is arbitrary.


14 The ethnic fractionalization variable is constant over time as changes in the ethnic composition of a country
over 20 to 30 years are rare.
2.5 Empirical estimation and Results


2.5.1 Empirical estimation 


We empirically test with linear regressions whether the composite measures reflecting social
institutions related to gender inequality $s_i$ are associated with each of the response variables
$y_i$, representing the chosen development outcomes. We estimate regressions of the form

$$y_i = \gamma + \beta s_i + \text{control variables}_i + \varepsilon_i \qquad (2.2)$$

using information at the country level. We are mainly interested in testing the null
hypothesis that the coefficient $\beta$ is zero at a statistical significance level of $\alpha = 10\%$. If the null
hypothesis is rejected, it is reasonable to infer that the measure proxying social institutions
related to gender inequality does matter for the given response variable, as predicted in the
hypotheses from sections 2.2 and 2.3.

The general procedure used for each of the response variables consists of two steps. First, 
we start by examining the effect of SIGI. We begin our estimation with a simple linear regression
with SIGI as the only regressor $s_i$. We then run a multiple linear regression adding the
main group of control variables that consists of the level of economic development, region 
dummies, religion dummies and the political system variables. If SIGI is significant in this 
regression, we continue and, if applicable, estimate the complete model with all control 
variables to confirm whether SIGI remains significant. 

As SIGI is a rather broad measure to rank and compare countries and policy implications 
are difficult to derive from it, in a second step we focus on the subindices to get a more 
precise idea about what kind of social institutions might be related to the chosen development 
outcomes. We estimate the same multiple linear regression(s) described above using the 
five subindices as $s_i$ one at a time instead of SIGI to explore which dimension of social
institutions related to gender inequality seems to be the most relevant. In the corresponding 
regression tables we only report the specification with the subindex or subindices that are 
statistically significant. It must be noted that we keep and show even those control variables 
that are not statistically significant in the regression, as we want to stress that the social 
institutions indices are associated with the development outcomes even if we include these 
control variables. 

All regressions are estimated with Ordinary Least Squares (OLS). Regression diagnos- 
tics not reported here suggest that heteroscedasticity is a possible issue in our data and that 
there are influential observations that could drive our results. Concerning the first issue, it 
is known that if the model is well specified, the OLS estimator of the regression parameters 
remains unbiased in the presence of heteroscedasticity, but the estimator of the covariance
matrix of the parameter estimates can be biased and inconsistent making inference about
the estimated regression parameters problematic. Violations of homoscedasticity can lead to 
hypothesis tests that are not valid and confidence intervals that are either too narrow or too 
wide. To deal with heteroscedasticity, we use ‘heteroscedasticity-consistent’ (HC) standard 
errors. This means that while the parameters are still estimated with OLS, alternative meth- 
ods of estimating the standard errors that do not assume homoscedasticity are applied. As 
the samples we use contain fewer than 150 observations, we use HC3 robust standard errors
proposed by Davidson and MacKinnon (1993), which are better in the case of small samples.
These are the standard errors that are presented in the regression Tables 2.5-2.9. Simulation
studies by Long and Ervin (2000) have shown that HC standard error estimates tend to
maintain test size closer to the nominal alpha level in the presence of heteroscedasticity than OLS
standard error estimates that assume homoscedasticity. These authors recommend the use of
HC3 robust standard errors, especially for sample sizes below 250, as they can keep the
test size at the nominal level regardless of the presence or absence of heteroscedasticity, with
only a minor loss of power when the errors are indeed homoscedastic.¹⁵
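To make the difference between classical and HC3 standard errors concrete, the following sketch estimates a simple regression of the form of equation 2.2 without control variables and computes both standard errors for the slope. It is a minimal illustration for the one-regressor case, written with the Python standard library only, and not the estimation code used for the tables; the function name `simple_ols_hc3` and the data are hypothetical.

```python
import math


def simple_ols_hc3(x, y):
    """OLS of y on an intercept and one regressor x, returning the estimates
    together with the classical and the HC3 standard error of the slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)

    beta = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    gamma = y_bar - beta * x_bar
    resid = [yi - gamma - beta * xi for xi, yi in zip(x, y)]

    # Leverage values (diagonal of the hat matrix) in the one-regressor case.
    lev = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]

    # Classical standard error: assumes a common error variance.
    s2 = sum(e ** 2 for e in resid) / (n - 2)
    se_classical = math.sqrt(s2 / sxx)

    # HC3: each squared residual is inflated by 1 / (1 - h_i)^2, which gives
    # more weight to high-leverage observations.
    var_hc3 = sum(
        ((xi - x_bar) / sxx) ** 2 * (e / (1.0 - h)) ** 2
        for xi, e, h in zip(x, resid, lev)
    )
    return {
        "gamma": gamma,
        "beta": beta,
        "se_classical": se_classical,
        "se_hc3": math.sqrt(var_hc3),
    }


# Hypothetical data whose spread grows with x (heteroscedastic by construction).
x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
y = [1.0, 1.1, 0.9, 1.4, 0.8, 1.9, 0.6, 2.3]
fit = simple_ols_hc3(x, y)
t_hc3 = fit["beta"] / fit["se_hc3"]  # test statistic for H0: beta = 0
```

The coefficient estimates are the same under both approaches; only the standard error, and therefore the test of the null hypothesis, changes.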


In addition to this, we also use the bootstrap with 1000 replications to compute bias-corrected
and accelerated (BCa) 90% confidence intervals for the regression coefficients
computed with OLS (Efron and Tibshirani, 1993). One of the main advantages of bootstrapping
methods is that no assumptions about the sampling distribution or about the statistic are
needed. The results are not reported here, but are available upon request, and confirm that all
the coefficients that are significant in Tables 2.5-2.9 remain significant when using BCa 90%
confidence intervals around them.
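The bootstrap procedure described above can be sketched for the slope of a simple regression. The code is a textbook-style BCa interval written with the Python standard library, under the assumption that countries (pairs of regressor and response) are resampled with replacement; it is not the authors' implementation, and the function names and simulated data are hypothetical.

```python
import random
from statistics import NormalDist


def ols_slope(pairs):
    """OLS slope of y on x (with intercept) for a list of (x, y) pairs."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
    return sum((x - x_bar) * (y - y_bar) for x, y in pairs) / sxx


def bca_interval(pairs, stat, level=0.90, n_boot=1000, seed=0):
    """Bias-corrected and accelerated bootstrap confidence interval."""
    rng = random.Random(seed)
    norm = NormalDist()
    n = len(pairs)
    theta_hat = stat(pairs)

    # 1. Bootstrap replicates: resample observations with replacement.
    reps = sorted(
        stat([pairs[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )

    # 2. Bias correction from the share of replicates below the estimate
    #    (clipped so that the normal quantile stays finite).
    share = sum(1 for t in reps if t < theta_hat) / n_boot
    share = min(max(share, 1 / (n_boot + 1)), n_boot / (n_boot + 1))
    z0 = norm.inv_cdf(share)

    # 3. Acceleration from leave-one-out (jackknife) estimates.
    jack = [stat(pairs[:i] + pairs[i + 1:]) for i in range(n)]
    jack_mean = sum(jack) / n
    num = sum((jack_mean - t) ** 3 for t in jack)
    den = 6.0 * sum((jack_mean - t) ** 2 for t in jack) ** 1.5
    accel = num / den if den > 0 else 0.0

    # 4. Adjusted percentiles of the bootstrap distribution.
    def adjusted(p):
        z = norm.inv_cdf(p)
        return norm.cdf(z0 + (z0 + z) / (1.0 - accel * (z0 + z)))

    tail = (1.0 - level) / 2.0
    lo_idx = int(round(adjusted(tail) * (n_boot - 1)))
    hi_idx = int(round(adjusted(1.0 - tail) * (n_boot - 1)))
    return reps[lo_idx], reps[hi_idx]


# Simulated data with a true slope of 2 (purely illustrative).
rng = random.Random(42)
data = [(i / 10, 1.0 + 2.0 * (i / 10) + rng.gauss(0, 0.5)) for i in range(30)]
low, high = bca_interval(data, ols_slope)
significant = not (low <= 0.0 <= high)  # zero outside the 90% interval
```

A coefficient is treated as significant at the 10% level in this sketch when zero lies outside the 90% interval, which mirrors the check described in the text.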


To deal with the second issue and check whether influential observations drive the results,
we take the estimates of a regression obtained with OLS with standard variance estimator to
detect the observations with unusual influence or leverage based on Cook's distance. Cook's
distance is a commonly used estimate of the influence of a data point when doing least
squares regression. We exclude countries from the sample if the value of Cook's distance is
larger than 4/n, with n being the number of observations, and re-estimate each regression
on the restricted sample with HC3 robust standard errors. In all the cases we confirm that
even after we exclude influential observations, the results remain basically unchanged.¹⁶ The
regressions are not reported here, but are available upon request.
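The exclusion rule based on Cook's distance can be illustrated for the one-regressor case. The sketch below uses the standard formula for Cook's distance with p = 2 estimated parameters (intercept and slope) and the 4/n cut-off mentioned in the text; the function names and the data are hypothetical.

```python
def cooks_distance(x, y):
    """Cook's distance of each observation in an OLS regression of y on an
    intercept and one regressor x."""
    n, p = len(x), 2  # p: number of estimated parameters
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    gamma = y_bar - beta * x_bar

    resid = [yi - gamma - beta * xi for xi, yi in zip(x, y)]
    lev = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    s2 = sum(e ** 2 for e in resid) / (n - p)

    return [e ** 2 * h / (p * s2 * (1.0 - h) ** 2) for e, h in zip(resid, lev)]


def influential(x, y):
    """Indices of observations whose Cook's distance exceeds 4/n."""
    cutoff = 4.0 / len(x)
    return [i for i, d in enumerate(cooks_distance(x, y)) if d > cutoff]


# Hypothetical data on a straight line, except for the last observation.
x = list(range(10))
y = [2 * xi for xi in x]
y[-1] = 40
flagged = influential(x, y)

# Restricted sample on which the regression would be re-estimated.
x_restricted = [xi for i, xi in enumerate(x) if i not in flagged]
y_restricted = [yi for i, yi in enumerate(y) if i not in flagged]
```

In this example only the last observation, which combines a large residual with high leverage, exceeds the cut-off and is dropped before re-estimation.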


15 Certainly, heteroscedasticity-consistent standard errors are not a panacea for inferential problems under
heteroscedasticity. As pointed out by some authors, there are limitations and trade-offs in these estimators (e.g.
Kauermann and Carroll, 2001; Wilcox, 2001).

16 As an alternative procedure we use robust regression with iteratively reweighted least squares as described in
Hamilton (1992), and confirm that results are similar.
We consider that the model specification is reasonable. However, possible endogeneity
of our main regressors $s_i$ (the SIGI and its subindices) should be taken into account when
interpreting the coefficients of $s_i$ as they would be biased and inconsistent in this case.
Endogeneity arises if $s_i$ is correlated with the disturbance $\varepsilon_i$ in equation 2.2. There are three
sources of endogeneity: omitted variables, measurement error and simultaneity (Wooldridge,
2002). We have included control variables to minimize omitted variable bias, although it is
impossible to completely rule out this problem. Concerning measurement error, we regard
the SIGI and the subindices as adequate proxies of social institutions related to gender
inequality. It is not very plausible that there are errors in measurement that are related to the
unobserved social institutions. The last source, simultaneity, arises when $s_i$ is determined
simultaneously with $y_i$. As was discussed in section 2.1, we consider that social institutions
related to gender inequality $s_i$ are relatively stable and long-lasting. Therefore, we think it is
unlikely that the response variables $y_i$ influence $s_i$.¹⁷


2.5.2 Results


Before we run the regressions it is necessary to check first the correlation between the 
subindices to rule out redundancy, and secondly between the subindices and the control 
variables to check whether the social institutions indices are proxies for these control vari- 
ables. The Pearson correlation coefficient between the subindices is always positive, but not 
always significant (Table 2.3). The correlation coefficients are always lower than 0.6, with 
the exception of the correlation between the subindices Family Code and Ownership rights, 
which is equal to 0.75.¹⁸ Table 2.4 shows that the absolute value of the Pearson correlation
coefficient between the social institutions indicators and the control variables is always lower 
than 0.6, except for the SIGI and the subindices Family code and Ownership rights and the 
two variables capturing literacy of the whole population and of the female population. 
Regression results using female secondary education as dependent variable are presented 
in Table 2.5. Regression (1) with SIGI as the only regressor yields a negative and statistically 
significant association. Higher levels of inequality are associated with lower levels of female 


17 Social institutions are hard to measure. Therefore, sometimes one has to rely on legal indicators to proxy
them, although we acknowledge that this could pose problems as there is for example an international
mechanism, the Convention on the Elimination of All Forms of Discrimination against Women (CEDAW), that
aims at changing social institutions through legal measures. However, the impact of CEDAW on national
legislation depends on the willingness of governments to sign and ratify it without reservation and on their
willingness and ability to enact the new laws. Given the constituting function of social institutions for a society
this could be difficult and depends on many factors.

18 Table 1.7 shows Kendall tau b between the five subindices and confirms that they are positively correlated,
albeit not perfectly.
secondary education. The association vanishes in regression (2) if one includes the level of
economic development, religion, region and the political system as control variables. Using
the subindex Family code instead of SIGI as the main regressor in regression (3) shows a
different picture. The subindex is statistically significant even if the control variables are
included. The adjusted coefficient of determination R² is 0.78. Hence, we find no evidence
against Hypothesis 1 that states that social institutions related to high gender inequality are
negatively associated with female education.¹⁹

Results obtained using total fertility rate and child mortality as response variables are
shown in Tables 2.6 and 2.7. In both cases, the simple linear regression (1) using SIGI as the
only regressor shows a positive and significant statistical association between SIGI and the
response variable. Higher levels of inequality are associated with higher levels of fertility and
with higher levels of child mortality. However, once control variables related to the level of
economic development, religion, region and the political system in a country are included in
regression (2), SIGI is no longer statistically significant. This is not the case when we use the
subindex Family code as the main regressor, as it is significant in regression (3) which uses
the same control variables, and even in regression (4) which adds two additional regressors:
the share of literate adult female population and a dummy reflecting high adult HIV/AIDS
prevalence. In regression (4) the obtained adjusted R² is 0.84 for fertility and 0.82 for child
mortality. Hence, we cannot reject Hypotheses 2 and 3, suggesting that social institutions
related to high gender inequality are associated with higher fertility levels and higher child
mortality.²⁰ As the subindex Family code is the relevant social institutions measure in our
empirical estimations, it seems that social institutions that deprive women of their autonomy
and bargaining power in the family and that might restrict women's possibilities outside the
family do matter for female education, fertility and child mortality.

Table 2.8 shows the results obtained for the dependent variable voice and accountability. 
Regression (1) with SIGI as the only regressor shows a negative and statistically significant 
association: higher levels of gender inequality are associated with lower levels of voice and 
accountability. This association remains significant in regression (2) where we add the level 
of economic development, religion, region and the political system²¹ as control variables,
and in the complete specification shown in regression (3) where we additionally include the 


19 Regressions not reported here, but available upon request, using primary gross completion rates obtained from
World Bank (2008) instead of female secondary schooling as the dependent variable yield similar results.
20 Regressions not shown here, but available upon request, confirm that the results concerning mortality rates
hold when using infant mortality rates taken from World Bank (2008) instead of child mortality rates as the
dependent variable.
21 Recall that in the governance regressions we only include the electoral democracy variable of Freedom House
(2008) as the civil liberties index is included in the chosen governance indicators which are now the response
variables.
proportion of seats held by women in national parliaments, the literacy rate of the population,
a measure of openness of the economy, and a measure of ethnic fractionalization. In
regression (3), we obtain an adjusted R² of 0.69. We explore which dimension of social
institutions related to gender inequality is behind this result and find that it is the subindex
Civil liberties. The specifications with the subindex Civil liberties in regressions (4) and (5)
show that this subindex is negatively associated with voice and accountability and that this
association is statistically significant even with the control variables. In regression (5) the
adjusted R² is 0.69. With this evidence Hypothesis 4 cannot be rejected, which suggests that
social institutions related to gender inequality inhibit the building blocks of good governance
in the form of voice and accountability. The subindex Civil liberties is the relevant social
institutions measure in our empirical estimations. The freedom of women to participate in
public life seems to increase the quality of governance of a society. Relating back to theory,
this could be because women tend to be more socially oriented than men and are a group that
cross-cuts cleavages in general.


Results for the other component of governance, rule of law, are shown in Table 2.9, pro- 
viding no evidence against Hypothesis 5. Regression (1) shows a negative and statistically 
significant association between SIGI and rule of law: higher levels of inequality are associ- 
ated with lower levels of rule of law. This association remains significant in regression (2) 
where we add the level of economic development, religion, region and the political system as 
control variables, and in the complete specification in regression (3) where we additionally 
include the proportion of seats held by women in national parliaments, the literacy rate of the 
population, a measure of openness of the economy, and a measure of ethnic fractionalization. 
In this last regression, we obtain an adjusted R² of 0.51. Again, we explore
which dimension of social institutions related to gender inequality is the relevant one for
rule of law, and find that two subindices matter: Ownership rights and Civil liberties.²² The
specifications with the subindices yield results similar to those of the SIGI and are presented
in regressions (4) and (5) for Ownership rights and (6) and (7) for Civil liberties. For both
subindices the adjusted R² obtained for the complete specification is 0.56. As postulated in
Hypothesis 5, social institutions related to gender inequality seem to matter for governance 
inhibiting the rule of law, e.g. through personal rule and inequality in justice. Assuming 
that women's attitudes are different from those of men and that they countervail injustice, 
women's power in a society contributes to improve rule of law. The two subindices proxy 
where this power comes from, with Ownership rights measuring economic power through 


22 As shown in Table 2.3 the Pearson Correlation coefficient between the subindices Ownership rights and Civil 
liberties is 0.36. 
access to property and Civil liberties measuring the freedom to participate in and to shape
public life. 

A reasonable question is whether the social institutions indicators are capturing different 
religions. In the regressions reported here, we control for religion using a Christian and a 
Muslim dummy. As the results show, at least one subindex is significant when we control 
for religion. One could argue that what matters is how religion is practiced in the considered 
regions, and that the SIGI and the subindices might capture regional practice of religion. 
Therefore, we re-estimate all regressions including interactions between the religion and 
region dummies. The results for the SIGI and the subindices remain unchanged, suggesting
that they capture something different from religion and the regional practice of it.²³
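
The robustness check with religion and region interactions can be sketched as follows. This is only an illustration of how such interaction dummies are typically built as products of the two sets of dummies; the function name, the `_x_` naming scheme and the example country are assumptions, and the sketch does not reproduce the estimation itself.

```python
from itertools import product

# Dummy names follow the variable labels used in the tables of this essay.
RELIGIONS = ("Muslim", "Christian")
REGIONS = ("SA", "ECA", "LAC", "MENA", "EAP")


def add_interactions(row: dict) -> dict:
    """Return a copy of `row` extended with religion x region dummies.

    Each interaction equals 1 only if both the religion dummy and the
    region dummy equal 1 for that country, and 0 otherwise.
    """
    out = dict(row)
    for religion, region in product(RELIGIONS, REGIONS):
        out[f"{religion}_x_{region}"] = row[religion] * row[region]
    return out


# Hypothetical country: majority Muslim, located in South Asia.
country = {"Muslim": 1, "Christian": 0,
           "SA": 1, "ECA": 0, "LAC": 0, "MENA": 0, "EAP": 0}
extended = add_interactions(country)
print(extended["Muslim_x_SA"], extended["Christian_x_SA"])
```

With two religion dummies and five region dummies this adds ten regressors, which is worth keeping in mind given the sample sizes of roughly 100 countries.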


2.6 Conclusion 


This study presents several answers to the question why we should care about social insti- 
tutions related to gender inequality beyond the intrinsic value of gender equality. We derive 
hypotheses from existing theories and empirically test them with linear regression at the 
cross-country level using the newly created Social Institutions and Gender Index (SIGI) and
its subindices. Our results show that social institutions related to gender inequality are as- 
sociated with lower female secondary education, higher fertility rates, higher child mortality 
and lower levels of governance measured as voice and accountability and rule of law. We 
find that apart from geography, political system, the level of economic development and reli- 
gion, one has to consider social institutions related to gender inequality to better account for 
differences in important development outcomes. 

The empirical estimation follows a two-step procedure for each outcome measure. First, 
the focus is to examine the explanatory value of the SIGI. In the specifications including 
all control variables, the SIGI is significant in the regressions for the governance measures 
‘voice and accountability’ and ‘rule of law’. If one interprets the SIGI as a summary measure 
of lack of power of women in all spheres of society then it seems that when women have more 
power, governance is better.²⁴ In the case of female secondary schooling, fertility rate and
child mortality the SIGI turns out to be insignificant in the complete specifications. 

Secondly, as the SIGI is a broad measure of social institutions related to gender inequality,
we investigate which particular dimension of social institutions is significantly related to
the chosen development outcomes, using the complete specifications. The subindex Family
code is negatively associated with female education and positively associated with fertility
and child mortality. These


23 The results are available upon request.
24 The association between two composite measures like the SIGI and the governance indicators has to be
interpreted carefully.
results suggest that social institutions that deprive women of their autonomy and bargaining
power in the family do matter for female education, fertility and child mortality. The
subindex Civil liberties is the dimension of social institutions that is significantly related to
the governance component ‘voice and accountability’. The freedom of women to participate
in public life seems to increase the quality of governance of a society as women tend to be
more socially oriented than men and are a group that cross-cuts cleavages in general. The
‘rule of law’ component of governance is negatively related to the subindices Civil liberties
and Ownership rights. The two subindices proxy the sources of women's power in a society, with
Ownership rights measuring access to property and Civil liberties measuring the freedom to
participate in public life. Assuming that women's attitudes are different from those of men
and that they countervail personal rule, women's power in a society is a relevant factor to
improve ‘rule of law’.

Although the subindices Family code, Ownership rights and Civil liberties are the rele- 
vant dimensions of social institutions related to gender inequality for the response variables 
considered in this study, this does not mean that the other two subindices Son preference and 
Physical integrity are not important intrinsically or instrumentally for other outcomes. Case 
studies investigating the mechanisms between social institutions and the outcome variables 
are necessary. Our study has the limitations of any cross-sectional regression analysis as we 
cannot rule out omitted variable bias. Causality can never be derived from regression anal- 
ysis with cross-sectional data unless valid instruments are found. Concerning the results of 
the subindices, these should be considered exploratory and need to be confirmed with further 
research, which should also include the elaboration of appropriate theories linking social in- 
stitutions related to gender inequality with each of the development outcomes used in this 
study. 

Social institutions are long-lasting and deep-seated in people’s minds. Changing them is a
difficult task and requires approaches tailored to the particular needs and the socio-economic
context (Jütting and Morrisson, 2005). The state can certainly help attenuate the effects
of social institutions through specific policies. It may set incentives to counteract social
institutions, e.g. in the form of laws to fight against discriminatory practices or through the
implementation of programs favoring girls and women. Micro-credit programs or subsidies
targeted at mothers are good examples here. Nevertheless, changing social institutions needs
more than that. It needs a thorough understanding of the power relations in a country and
people who are willing to become reform drivers and initiate learning processes that should
be complemented by deliberation and public discussion at all levels of society. Be it through
internal or external forces, women need help to empower themselves. That is what Sen calls
‘agency of women’ (Sen, 1999).


2.7 Tables 


Table 2.1: Description and Sources of Variables

Response Variables

Fertility
    Total fertility rate (births per woman)
    (average of existing values over the last five years).
    Source: World Bank (2009a)

Child mortality
    Children under five mortality rate per 1,000 live births (year 2005).
    Source: World Bank (2008)

Female secondary school
    School enrollment, secondary, female (% gross)
    (average of existing values over the last five years).
    Source: World Bank (2009a)

Voice and accountability
    Index that combines several data sources based on expert perceptions of "the extent to
    which a country's citizens are able to participate in selecting their government, as well as
    freedom of expression, freedom of association, and a free media" (Kaufmann et al., 2008);
    (average of existing values over the last five years).
    Source: Kaufmann et al. (2008)

Rule of law
    Index that combines several data sources based on expert perceptions of "the extent to
    which agents have confidence in and abide by the rules of society, and in particular the
    quality of contract enforcement, property rights, the police, and the courts, as well as
    the likelihood of crime and violence" (Kaufmann et al., 2008);
    (average of existing values over the last five years).
    Source: Kaufmann et al. (2008)

Regressors

SIGI
    Social Institutions and Gender Index.
    Source: Branisa et al. (2009a)

Subindex family code
    Subindex Family code.
    Source: Branisa et al. (2009a)

Subindex civil liberties
    Subindex Civil liberties.
    Source: Branisa et al. (2009a)

Subindex physical integrity
    Subindex Physical integrity.
    Source: Branisa et al. (2009a)

Subindex son preference
    Subindex Son preference.
    Source: Branisa et al. (2009a)

Subindex ownership rights
    Subindex Ownership rights.
    Source: Branisa et al. (2009a)

Literacy female
    Share of literate adult female population (15+) (%) year 2000
    (average of the existing values over the last 10 years).
    Source: World Bank (2009a)

Literacy population
    Share of literate population (whole)
    (average of the existing values over the last 10 years).
    Source: Human Development Report (HDR) stats office

log GDP
    Log of GDP per capita, PPP (constant 2005 international $)
    (average over the last 10 years).
    Source: World Bank (2008)

FH civil liberties
    Index that measures the extent to which countries ensure civil liberties including freedom
    of expression, assembly, association, education, and religion as well as personal autonomy.
    It covers whether there is an established and generally equitable system of rule of law,
    free economic activity and equality of opportunity (scale -1 (best) to -7 (worst))
    (average of the existing values over the last 10 years).
    Source: Freedom House (2008)

Electoral democracy
    Index that qualifies countries as electoral democracy when there exist competitive,
    universal and free and secret elections and a multiparty system that can access the media
    for political campaigning (average of the existing values over the last 10 years).
    Source: Freedom House (2008)

Parliament
    Proportion of seats held by women in national parliaments (%)
    (average of the existing values over the last 10 years).
    Source: World Bank (2009a)

Aids
    Adult (15-49) HIV prevalence percent by country, 1990-2007. Countries were coded 1 if the
    adult (15-49) HIV prevalence rate exceeds 5 per cent, otherwise 0.
    Source: UNAIDS/WHO (2008)

Ethnic
    The ethnic fractionalization measure gives the probability that two individuals selected at
    random from a population are members of different groups. It is calculated with data on
    language and origin using the following formula:
        FRAC_j = 1 - \sum_{i=1}^{N} s_{ij}^2,
    where s_{ij} is the proportion of group i = 1, ..., N in country j, going from complete
    homogeneity (an index of 0) to complete heterogeneity (an index of 1).
    Source: Alesina et al. (2003)

Openness
    Share of imports of goods and services of total GDP.
    Source: World Bank (2008)

Muslim
    Countries get a 1 if at least 50 % of the population are Muslim, 0 otherwise.
    Source: Central Intelligence Agency (2009)

Christian
    Countries get a 1 if at least 50 % of the population are Christian, 0 otherwise.
    Source: Central Intelligence Agency (2009)

SA
    Countries get a 1 if located in region South Asia, 0 otherwise.

ECA
    Countries get a 1 if located in region Europe and Central Asia, 0 otherwise.

LAC
    Countries get a 1 if located in region Latin America and the Caribbean, 0 otherwise.

MENA
    Countries get a 1 if located in region Middle East and North Africa, 0 otherwise.

EAP
    Countries get a 1 if located in region East Asia and Pacific, 0 otherwise.
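
The fractionalization formula and the threshold coding of the dummy variables described in Table 2.1 can be illustrated with a short sketch. The function names and the example shares are hypothetical; the code only restates the definitions given in the table and is not the procedure used to build the original data.

```python
def ethnic_fractionalization(shares):
    """FRAC_j = 1 - sum_i s_ij^2 for the group shares of one country.

    `shares` are population proportions that must be non-negative and
    sum to one. The result is 0 for a single group and approaches 1 as
    the population splits into many small groups.
    """
    if any(s < 0 for s in shares):
        raise ValueError("shares must be non-negative")
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to one")
    return 1.0 - sum(s * s for s in shares)


def threshold_dummy(value, cutoff, strict):
    """Code a 0/1 dummy from a continuous value.

    strict=True  -> 1 if value exceeds the cutoff (as for Aids, 5 per cent)
    strict=False -> 1 if value is at least the cutoff (as for Muslim and
                    Christian, 50 per cent of the population)
    """
    if strict:
        return 1 if value > cutoff else 0
    return 1 if value >= cutoff else 0


# Hypothetical country with two equally sized groups.
print(ethnic_fractionalization([0.5, 0.5]))
# Hypothetical HIV prevalence of 6.2 per cent and Muslim share of 50 per cent.
print(threshold_dummy(6.2, 5.0, strict=True), threshold_dummy(50.0, 50.0, strict=False))
```

Note that with a finite number of groups N the index cannot reach 1 exactly; its maximum is 1 - 1/N, attained when all groups are of equal size.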
Table 2.2: Descriptive statistics of variables used

Variable                      Observations  Mean      Std. Dev.  Min       Max
SIGI                          102           0.126     0.122      0.002     0.678
Subindex Family code          112           0.326     0.223      0.004     0.797
Subindex Civil liberties      123           0.160     0.259      0         1
Subindex Physical integrity   114           0.358     0.191      0         0.971
Subindex Son preference       123           0.134     0.240      0         1
Subindex Ownership rights     122           0.298     0.266      0         1
Fertility                     121           3.562     1.702      0.933     7.678
Child mortality               119           80.005    67.777     3.758     273.8
Female secondary school       108           59.210    30.484     6.037     113.275
Rule of law                   123           -0.563    0.718      -2.142    1.658
Voice and accountability      123           -0.583    0.752      -2.102    1.088
SA                            124           0.056     0.232      0         1
ECA                           124           0.137     0.345      0         1
LAC                           124           0.177     0.384      0         1
MENA                          124           0.145     0.354      0         1
EAP                           124           0.137     0.345      0         1
Muslim                        124           0.331     0.472      0         1
Christian                     124           0.435     0.498      0         1
log GDP                       115           7.988     1.121      5.609     10.553
Literacy population           121           0.741     0.218      0.173     1
Literacy female               106           0.705     0.251      0.128     0.998
Electoral democracy           120           0.455     0.459      0         1
FH civil liberties            121           -4.366    1.434      -7        -1.4
Parliament                    118           10.630    6.925      0         29.556
Aids                          116           0.138     0.346      0         1
Openness                      119           0.452     0.261      0.013     1.914
Ethnic                        120           0.517     0.237      0.039     0.930
Table 2.3: Pearson Correlation Coefficient between the SIGI and the Subindices

                              SIGI      Subindex  Subindex  Subindex  Subindex  Subindex
                                        Family    Civil     Physical  Son       Ownership
SIGI                 ρ        1
                     Obs.     102
Subindex Family      ρ        0.79      1
                     p-value  0.0000
                     Obs.     102       112
Subindex Civil       ρ        0.71      0.47      1
                     p-value  0.0000    0.0000
                     Obs.     102       112       123
Subindex Physical    ρ        0.66      0.59      0.28      1
                     p-value  0.0000    0.0000    0.0025
                     Obs.     102       103       113       114
Subindex Son         ρ        0.54      0.18      0.53      0.02      1
                     p-value  0.0000    0.0594    0.0000    0.8312
                     Obs.     102       112       122       114       123
Subindex Ownership   ρ        0.74      0.75      0.36      0.51      0.13      1
                     p-value  0.0000    0.0000    0.0001    0.0000    0.1504
                     Obs.     102       111       121       112       121       122
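
The number of observations in Table 2.3 differs from pair to pair, which is what one would expect if each coefficient is computed on the countries for which both measures are available. The following minimal sketch shows a Pearson correlation with such pairwise deletion of missing values. That the table was produced in exactly this way is an assumption, and the function name and example data are hypothetical.

```python
import math


def pearson_pairwise(xs, ys):
    """Pearson correlation using only pairs where both values are present.

    Missing values are represented by None. Returns the coefficient and
    the number of complete pairs that entered the calculation.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy), n


# Hypothetical scores for five countries; two countries lack one measure.
index_a = [1.0, 2.0, 3.0, None, 5.0]
index_b = [2.0, 4.0, 6.0, 8.0, None]
rho, n_pairs = pearson_pairwise(index_a, index_b)
print(rho, n_pairs)
```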
Table 2.4: Correlation of the SIGI and the Subindices with the Control Variables

                              SIGI      Subindex  Subindex  Subindex  Subindex  Subindex
                                        Family    Civil     Physical  Son       Ownership
log GDP              ρ        -0.34     -0.39     0.20      -0.47     0.16      -0.48
                     p-value  0.0005    0.0000    0.0362    0.0000    0.0948    0.0000
                     Obs.     98        108       114       105       114       114
Muslim               ρ        0.50      0.42      0.57      0.40      0.36      0.23
                     p-value  0.0000    0.0000    0.0000    0.0000    0.0000    0.0122
                     Obs.     102       112       123       114       123       122
Christian            ρ        -0.39     -0.33     -0.40     -0.27     -0.37     -0.05
                     p-value  0.0001    0.0003    0.0000    0.0036    0.0000    0.5662
                     Obs.     102       112       123       114       123       122
SA                   ρ        0.30      0.13      0.33      -0.13     0.49      0.14
                     p-value  0.0023    0.1589    0.0002    0.1652    0.0000    0.1319
                     Obs.     102       112       123       114       123       122
ECA                  ρ        -0.32     -0.38     -0.25     -0.17     -0.17     -0.33
                     p-value  0.0012    0.0000    0.0057    0.0762    0.0659    0.0002
                     Obs.     102       112       123       114       123       122
LAC                  ρ        -0.42     -0.47     -0.29     -0.36     -0.24     -0.35
                     p-value  0.0000    0.0000    0.0012    0.0001    0.0076    0.0001
                     Obs.     102       112       123       114       123       122
MENA                 ρ        0.23      0.16      0.53      0.08      0.42      0.02
                     p-value  0.0196    0.0843    0.0000    0.3796    0.0000    0.8501
                     Obs.     102       112       123       114       123       122
EAP                  ρ        -0.19     -0.29     -0.11     -0.15     0.10      -0.28
                     p-value  0.0505    0.0017    0.2205    0.1127    0.2934    0.0016
                     Obs.     102       112       123       114       123       122
Electoral democ.     ρ
                     p-value  0.0001    0.0000    0.0001    0.0001    0.0179    0.0091
                     Obs.     101       110       119       111       119       119
FH civil libert.     ρ        -0.44     -0.30     -0.42     -0.42     -0.28     -0.25
                     p-value  0.0000    0.0016    0.0000    0.0000    0.0021    0.0056
                     Obs.     101       110       120       112       120       120
Parliament           ρ        -0.15     -0.15     -0.28     -0.18     -0.17     -0.11
                     p-value  0.1514    0.1202    0.0023    0.0578    0.0750    0.2611
                     Obs.     100       109       117       110       118       117
Literacy pop.        ρ        -0.66     -0.70     -0.19     -0.59     -0.25     -0.59
                     p-value  0.0000    0.0000    0.0389    0.0000    0.0054    0.0000
                     Obs.     102       112       120       112       121       119
Literacy female
Table 2.5: Linear regressions with dependent variable female secondary school

Specification with SIGI  (1)          (2)          | Specification with Subindex  (3)
                         b/se         b/se         |                              b/se
SIGI                     -141.77***   -10.91       | Subindex family code         -39.10**
                         (37.31)      (36.37)      |                              (11.64)
log GDP                               12.69***     | log GDP                      11.46***
                                      (3.39)       |                              (2.61)
Muslim                                -2.21        | Muslim                       3.43
                                      (5.47)       |                              (4.84)
Christian                             5.31         | Christian                    4.18
                                      (5.48)       |                              (4.33)
SA                                    16.05        | SA                           12.3
                                      (8.75)       |                              (8.44)
ECA                                   40.26***     | ECA                          28.25***
                                      (8.98)       |                              (6.95)
LAC                                   18.33*       | LAC                          8.64
                                      (9.07)       |                              (7.41)
MENA                                  33.86**      | MENA                         29.67**
                                      (12.50)      |                              (9.69)
EAP                                   24.73**      | EAP                          14.36*
                                      (8.26)       |                              (6.53)
Electoral democracy                   8.11         | Electoral democracy          6.19
                                      (7.67)       |                              (6.84)
FH civil liberties                    1.95         | FH civil liberties           2.72
                                      (3.56)       |                              (2.89)
constant                 74.75***     -56.71       | constant                     -27.87
                         (4.12)       (37.27)      |                              (30.56)
Number of obs.           94           91           | Number of obs.               99
Adj. R-Square            0.28         0.75         | Adj. R-Square                0.78
Prob>F                   0.0003       0.0000       | Prob>F                       0.0000

* p < 0.05, ** p < 0.01, *** p < 0.001. HC3 robust standard errors in brackets.
Regressions (2) and (3) with controls for economic development, geography, religion and
political system. In this case, this specification corresponds to the complete specification.
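
The regression tables report HC3 heteroskedasticity-robust standard errors. As an illustration of what this estimator does, the sketch below computes the slope, the intercept and the HC3 standard error of the slope for a regression with a single regressor and a constant, the form taken by regression (1) in each table. It is a plain-Python sketch under that simplifying assumption, with a hypothetical function name and made-up data, and it is not the code used to produce the reported results.

```python
import math


def ols_hc3_simple(x, y):
    """OLS of y on a constant and x, with the HC3 standard error of the slope.

    HC3 rescales each squared residual by 1 / (1 - h_i)^2, where h_i is the
    leverage of observation i, so influential observations get more weight.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar

    meat = 0.0
    for xi, yi in zip(x, y):
        resid = yi - intercept - slope * xi
        leverage = 1.0 / n + (xi - x_bar) ** 2 / sxx
        if 1.0 - leverage < 1e-12:
            raise ValueError("HC3 undefined: an observation has leverage 1")
        meat += (xi - x_bar) ** 2 * (resid / (1.0 - leverage)) ** 2
    se_slope = math.sqrt(meat) / sxx
    return slope, intercept, se_slope


# Made-up data for three hypothetical countries.
slope, intercept, se = ols_hc3_simple([0.0, 1.0, 2.0], [0.0, 1.0, 0.0])
print(slope, intercept, se)
```

For the multivariate specifications the same idea applies with the full regressor matrix; in practice one would rely on an established statistics package rather than on a hand-written routine.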
Table 2.6: Linear regressions with dependent variable fertility

Specification with SIGI  (1)          (2)          | Specification with Subindex  (3)
                         b/se         b/se         |                              b/se
SIGI                     8.25***      1.73         | Subindex family code         1.89**
                         (2.3)        (2.61)       |                              (0.70)
log GDP                               -0.71***     | log GDP                      -0.60***
                                      (0.16)       |                              (0.12)
Muslim                                0.52         | Muslim                       0.34
                                      (0.27)       |                              (0.27)
Christian                             0.25         | Christian                    0.24
                                      (0.26)       |                              (0.25)
SA                                    -1.89***     | SA                           -1.73***
                                      (0.37)       |                              (0.41)
ECA                                   -2.44***     | ECA                          -2.08***
                                      (0.48)       |                              (0.38)
LAC                                   -0.96*       | LAC                          -0.68
                                      (0.47)       |                              (0.36)
MENA                                  -1.42*       | MENA                         -1.07*
                                      (0.63)       |                              (0.50)
EAP                                   -1.74***     | EAP                          -1.37***
                                      (0.42)       |                              (0.39)
Electoral democracy                   -0.2         | Electoral democracy          0.02
                                      (0.31)       |                              (0.29)
FH civil liberties                    -0.02        | FH civil liberties           -0.11
                                      (0.17)       |                              (0.13)
                                                   | Literacy female
                                                   | Aids
constant                 2.55***      9.76***      | constant                     7.89***
                         (0.25)       (1.82)       |                              (1.30)
Number of obs.           100          97           | Number of obs.               106
Adj. R-Square            0.31         0.82         | Adj. R-Square                0.80
Prob>F                   0.0006       0.0000       | Prob>F                       0.0000

* p < 0.05, ** p < 0.01, *** p < 0.001. HC3 robust standard errors in brackets.
Regressions (2) and (3) with minimum of controls for economic development, geography, religion and
political system. Regression (4) with complete specification for fertility.
Table 2.7: Linear regressions with dependent variable child mortality

Specification with SIGI  (1)          (2)          | Specification with Subindex  (3)          (4)
                         b/se         b/se         |                              b/se         b/se
SIGI                     318.56**     50.42        | Subindex family code         80.14**      77.23*
                         (108.81)     (150.58)     |                              (25.85)      (31.50)
log GDP                               -22.55**     | log GDP                      -20.24***    -13.82**
                                      (7.35)       |                              (5.34)       (5.09)
Muslim                                26.61        | Muslim                       14.23        5.74
                                      (14.13)      |                              (13.13)      (14.50)
Christian                             7.49         | Christian                    9.47         14.27
                                      (11.72)      |                              (10.31)      (10.81)
SA                                    -68.33***    | SA                           -61.30***    -71.03***
                                      (18.87)      |                              (17.05)      (16.33)
ECA                                   -85.65***    | ECA                          -66.13***    -53.16*
                                      (23.82)      |                              (16.75)      (20.65)
LAC                                   -66.65**     | LAC                          -50.60***    -50.23**
                                      (23.84)      |                              (14.88)      (18.89)
MENA                                  -97.73***    | MENA                         -86.25***    -93.71***
                                      (26.90)      |                              (21.71)      (23.48)
EAP                                   -73.44***    | EAP                          -59.37***    -55.65**
                                      (17.23)      |                              (15.02)      (17.85)
Electoral democracy                   -0.79        | Electoral democracy          7.05         1.75
                                      (15.86)      |                              (15.96)      (14.80)
FH civil liberties                    -4.54        | FH civil liberties           -8.33        -8.32
                                      (7.86)       |                              (6.65)       (6.44)
                                                   | Literacy female                           -62.77**
                                                   |                                           (21.39)
                                                   | Aids                                      -19.02
                                                   |                                           (14.56)
constant                 43.38***     272.39**     | constant                     209.47**     209.34**
                         (10.80)      (93.09)      |                              (66.26)      (63.27)
Number of obs.           99           97           | Number of obs.               106          99
Adj. R-Square            0.28         0.79         | Adj. R-Square                0.79         0.82
Prob>F                   0.0043       0.0000       | Prob>F                       0.0000       0.0000

* p < 0.05, ** p < 0.01, *** p < 0.001. HC3 robust standard errors in brackets.
Regressions (2) and (3) with controls for economic development, geography, religion and
political system. Regression (4) with complete specification for child mortality.
Table 2.8: Linear regressions with dependent variable ‘voice and accountability’

Specification with SIGI  (1)          (2)          (3)          | Specification with Subindex  (4)           (5)
                         b/se         b/se         b/se         |                              b/se          b/se
SIGI                     -2.60***     -1.42**      -1.59**      | Subindex civil liberties     -0.61**       -0.65**
                         (0.50)       (0.48)       (0.54)       |                              (0.23)        (0.23)
log GDP                               0.27***      0.30***      | log GDP                      [illegible]   0.27***
                                      (0.06)       (0.06)       |                              (0.05)        (0.06)
Muslim                                0.18         0.15         | Muslim                       0.16          0.21
                                      (0.13)       (0.14)       |                              (0.13)        (0.14)
Christian                             -0.03        -0.04        | Christian                    -0.05         -0.08
                                      (0.12)       (0.13)       |                              (0.12)        (0.12)
SA                                    -0.27        -0.28        | SA                           -0.12         -0.04
                                      (0.20)       (0.21)       |                              (0.18)        (0.20)
ECA                                   -0.64***     -0.56*       | ECA                          -0.52***      -0.57**
                                      (0.14)       (0.22)       |                              (0.13)        (0.22)
LAC                                   -0.40*       -0.41*       | LAC                          -0.32*        -0.31
                                      (0.17)       (0.18)       |                              (0.15)        (0.16)
MENA                                  -0.45        -0.47        | MENA                         -0.27         -0.23
                                      (0.23)       (0.25)       |                              (0.19)        (0.24)
EAP                                   -0.30*       -0.21        | EAP                          -0.14         -0.21
                                      (0.14)       (0.21)       |                              (0.13)        (0.18)
Electoral democracy                   1.10**       1.07***      | Electoral democracy          1.13**        1.14***
                                      (0.12)       (0.11)       |                              (0.10)        (0.10)
Parliament                                         0.01         | Parliament                                 0.01
                                                   (0.01)       |                                            (0.01)
Literacy population                                -0.31        | Literacy population                        0.24
                                                   (0.42)       |                                            (0.37)
Openness                                           -0.07        | Openness                                   0.23
                                                   (0.36)       |                                            (0.22)
Ethnic                                             -0.07        | Ethnic                                     0.01
                                                   (0.25)       |                                            (0.23)
constant                 -0.23*       -2.80***     -2.77***     | constant                     -3.28***      -3.37***
                         (0.10)       (0.45)       (0.47)       |                              (0.41)        (0.39)
Number of obs.           102          97           95           | Number of obs.               112           108
Adj. R-Square            0.18         0.69         0.69         | Adj. R-Square                0.68          0.69
Prob>F                   0.0000       0.0000       0.0000       | Prob>F                       0.0000        0.0000

* p < 0.05, ** p < 0.01, *** p < 0.001. HC3 robust standard errors in brackets.
Regressions (2) and (4) with controls for economic development, geography, religion and political system.
Regressions (3) and (5) with complete specification for governance/voice and accountability.


Table 2.9: Linear regressions with dependent variable ‘rule of law’

Specification with SIGI      (1)          (2)          (3)
                             b/se         b/se         b/se
SIGI                         -1.73***     -1.88***     -1.33*
                             (0.49)       (0.53)       (0.60)
log GDP                                   0.41***      0.36***
                                          (0.08)       (0.07)
Muslim                                    0            -0.04
                                          (0.16)       (0.16)
Christian                                 -0.18        0.18
                                          (0.15)       (0.14)
SA                                        0.18         0.26
                                          (0.22)       (0.24)
ECA                                       -0.84***     -0.67*
                                          (0.18)       (0.27)
LAC                                       -0.74***     -0.54*
                                          (0.19)       (0.21)
MENA                                      -0.14        0.17
                                          (0.27)       (0.32)
EAP                                       -0.31        -0.28
                                          (0.16)       (0.23)
Electoral democracy                       0.33*        0.40**
                                          (0.14)       (0.13)
Parliament                                             0.01
                                                       (0.01)
Literacy population                                    -0.29
                                                       (0.42)
Openness                                               0.69*
                                                       (0.33)
Ethnic                                                 -0.07
                                                       (0.32)
constant                     -0.35***     -3.37***     -3.32***
                             (0.10)       (0.58)       (0.52)
Number of obs.               102          97           95
Adj. R-Square                0.09         0.49         0.51
Prob>F                       0.0006       0.0000       0.0000

Specification with Subindices
                             (4)          (5)          |                            (6)          (7)
                             b/se         b/se         |                            b/se         b/se
Subindex ownership rights    -0.89***     -0.71**      | Subindex civil liberties    -0.75**      -0.63*
                             (0.20)       (0.23)       |                                         (0.25)
log GDP                      0.37***      0.30***      | log GDP                                  0.36***
                             (0.08)       (0.07)       |                                         (0.07)
Muslim                       -0.03        -0.02        | Muslim                                   0.11
                             (0.13)       (0.14)       |                                         (0.14)
Christian                    -0.11        -0.14        | Christian                                0.22
                             (0.14)       (0.13)       |                                         (0.13)
SA                           0.11         0.21         | SA                                       0.44
                             (0.17)       (0.20)       |                                         (0.26)
ECA                          -0.93***     -0.83***     | ECA                                      -0.74**
                             (0.16)       (0.22)       |                                         (0.22)
LAC                          -0.78***     -0.61**      | LAC                                      -0.51**
                             (0.19)       (0.19)       |                                         (0.18)
MENA                         -0.09        0.18         | MENA                                     0.30
                             (0.25)       (0.29)       |                                         (0.28)
EAP                          -0.35*       -0.36        | EAP                                      -0.23
                             (0.15)       (0.20)       |                                         (0.20)
Electoral democracy          0.38**       0.44***      | Electoral democracy                      0.46***
                             (0.11)       (0.11)       |                                         (0.12)
Parliament                                0.01         | Parliament                               0.01
                                          (0.01)       |                                         (0.01)
Literacy population                       -0.03        | Literacy population                      0.20
                                          (0.38)       |                                         (0.36)
Openness                                  0.71**       | Openness                                 0.73**
                                          (0.27)       |                                         (0.23)
Ethnic                                    -0.12        | Ethnic                                   -0.13
                                          (0.28)       |                                         (0.27)
constant                     -3.06***     -2.94***     | constant                    -4.05***     -3.83***
                             (0.56)       (0.53)       |                            (0.52)       (0.46)
Number of obs.               112          108          | Number of obs.              112          108
Adj. R-Square                0.53         0.56         | Adj. R-Square               0.52         0.56
Prob>F                       0.0000       0.0000       | Prob>F                      0.0000       0.0000

In column (6) only the entries shown are legible, together with one standard error, (0.13), whose row cannot be identified.

* p < 0.05, ** p < 0.01, *** p < 0.001. HC3 robust standard errors in brackets. Regressions (2), (4) and (6) with controls for economic development, geography,
religion and political system. Regressions (3), (5) and (7) with complete specification for governance/rule of law.
Essay 3


Reexamining the link between gender and corruption: The role of social institutions


Abstract 


In this paper we reexamine the link between gender inequality and corruption. We review 
the literature on the relationship between representation of women in economic and political 
life, democracy and corruption, and bring in a new previously omitted variable that captures 
the level of discrimination against women in a society: social institutions related to gender 
inequality. Using a sample of developing countries we regress corruption on the represen- 
tation of women, democracy and other control variables. Then we add the subindex civil 
liberties from the OECD Gender, Institutions and Development Database as the measure of 
social institutions related to gender inequality. The results show that corruption is higher in 
countries where social institutions deprive women of their freedom to participate in social 
life, even accounting for democracy and representation of women in political and economic 
life as well as for other variables. Our findings suggest that, in a context where social values 
disadvantage women, it might not be enough to push democratic reforms and to increase the 
participation of women to reduce corruption. 


Based on joint work with Maria Ziegler. 
3.1 Introduction


Is there a link between gender inequality and corruption in a society? The studies of Swamy 
et al. (2001) and Dollar et al. (2001) suggest that countries with greater representation of 
women in political and economic life tend to have lower levels of corruption. How can this 
relationship be explained? 

This could be attributed to behavioral differences between men and women. As mentioned 
by Dollar et al. (2001), there are experimental studies and studies using survey data that find 
that, on average, women are less selfish and might have higher moral and ethical standards 
than men (e.g. Eagly and Crowley, 1986; Glover et al., 1997; Eckel and Grossman, 1998; 
Rivas, 2008).¹ If one accepts that women are less selfish and align their actions with higher
moral standards than men, having women in important political and economic positions
might lead to less corruption in a country.

An alternative explanation is put forward by Swamy et al. (2001), who argue that the 
negative relationship between women's participation and corruption could be due to self- 
selection. Only a few women reach powerful positions, and these women possibly gain 
access to these positions as they are from the "better" part of the women's distribution. 

From a historical perspective, Goetz (2007) claims that it is gendered access to political 
positions that explains why women seem to be less corrupt than men. Excluded from male 
patronage networks, women are restricted in their opportunities for corrupt behavior. As they 
are newcomers or only few in the political or business sphere, women lack familiarity with 
the rules of illicit exchange to their own benefit. They try to assert their position by acting 
honestly and trustworthily. This all leads to fewer corrupt activities by women, but as time 
passes and more women get access to power this effect might vanish. 

It can also be argued that the observed relationship between women's representation and 
corruption is spurious. Swamy et al. (2001) and Dollar et al. (2001) warn that even if one 
controls for other factors in the regression, the observed relationship at the cross-country 
level could be due to some unobserved variable which influences both female representation 
and corruption. For example, according to Sung (2003) it might be the political system in the 
form of liberal democratic institutions that influences both. Sung (2003) argues that institu- 
tions of liberal democracy increase women's participation in government through values like 
equality, pluralism, fairness and tolerance. Competitive elections, an independent judiciary 
and a free press, which are elementary to a liberal democratic system, guarantee transparency 


1There are empirical studies that challenge the finding that women are the "fairer sex" (e.g. Andreoni and
Vesterlund, 2001; Alhassan-Alolo, 2007; Alatas et al., 2009). Another investigation highlights that when
women are in a powerful position, they take decisions that are closely related to women's needs
(Chattopadhyay and Duflo, 2004).
and hold government officials accountable, thereby reducing corruption. Therefore, the negative
effect of women's representation in government on corruption is spurious and vanishes
when one includes a measure of democracy in the regression, which is empirically confirmed
by Sung (2003). Swamy et al. (2001) draw attention to the "level of discrimination against
women" as another possible omitted variable that drives both female participation and corruption.
They claim that in countries that are more corrupt there is more discrimination
against women and argue that in countries where traditions and clientelism prevail, there is 
a preference for men in power. 

In this paper, we focus on the effect of discrimination against women on corruption in a 
society as we have a new measure of society's attitude towards gender inequality to empiri- 
cally test this relationship. Swamy et al. (2001) do not explain how this relationship operates, 
but several papers deal with this issue in a direct or indirect way (Tripp, 2001; Inglehart et al., 
2002; Rizzo et al., 2007). These authors claim that society's attitude towards women influ- 
ences how a political system functions and that it affects the positions women take in this 
system. Assuming that the level of corruption depends on the functioning of the political 
system, one could argue that society's attitude towards gender inequality has an impact on 
corruption. 

The study of Tripp (2001) focuses on women's movements as a countervailing force to 
prevailing practices of corruption in Eastern and South Africa.2 Political reforms at the
beginning of the 1990s, including free and competitive elections, a multi-party system and 
freedom of expression and association were not enough to give women access to powerful 
positions and to curtail the practices of patronage, clientelism and personal rule. Women 
could enter the system, but they were excluded from male-dominated networks and therefore 
from the benefits of clientelism. However, political reforms allowed the formation of social 
forces. The disadvantaged women organized in autonomous movements, which were broad- 
based, multi-ethnic and multi-religious. These movements cross-cut cleavages and started to 
demand transparency and the removal of clientelistic networks. 

A similar perspective is adopted by Inglehart et al. (2002) and Rizzo et al. (2007) who state 
that when a society favors gender equality, there is more tolerance in general, more personal 
freedom and individual autonomy. The absence of these values inhibits political reforms 
towards a democratic system. The study of Inglehart et al. (2002) finds that gender equality 
is the most important part of "self-expression values" appearing in post-industrialization so- 
cieties which directly contribute to both democratization and to a greater representation of 
women in politics. Focusing on Arab and non-Arab Muslim countries, Rizzo et al. (2007) 
show that even if democratic political institutions like elections, political parties or checks


2Waylen (1993) makes a similar point for Latin America. 
and balances are put in place, gender inequality can prevent these institutions from functioning
well.


We empirically test on a sample of developing countries the relationship between social 
institutions related to gender inequality and the level of corruption, and contribute to the 
literature discussed above. We focus on public corruption, which refers to the misuse of 
public office for private gain. It comprises grand corruption, which refers to activities of 
top officials and big companies, and petty corruption, which refers to the activities of people 
at the lower end of hierarchies (Pardo, 2004). To proxy society's attitude towards gender 
inequality or what Swamy et al. (2001) call "level of discrimination against women" we introduce
social institutions related to gender inequality into the analysis. Social institutions
are long-lasting norms, traditions and codes of conduct that shape gender roles and influence 
the opportunities of women and men in a society. As suggested by e.g. De Soysa and Jütting 
(2007) and in Essay 2, these guiding principles of human behavior affect development out- 
comes and should not be neglected in the study of a society. We measure social institutions 
related to gender inequality with the subindex civil liberties proposed in Essay 1, which is
based on variables from the OECD Gender, Institutions and Development Database (Jütting 
et al., 2008). This subindex captures society's attitude with regard to gender roles based on 
the freedom of women to participate in social life. 


Our aim is to investigate whether society's attitude towards gender inequality matters for 
corruption once one accounts for the representation of women in parliament and business as 
well as the political system of a country. The hypothesis is that in a society where women's 
participation in social life is restricted, there is a higher level of corruption. 


Even after controlling for democracy and political and economic participation of women, 
as well as for other factors, we find a robust and significant relationship between the subindex 
civil liberties and the level of corruption. We show that social institutions related to gender 
inequality are an important factor for the study of corruption. In societies where women are 
deprived of their freedoms to participate in social life, corruption is higher. As should be clear 
from the various existing theories the exact causal mechanism behind this relationship is not 
obvious and it cannot be established in this study since we conduct a cross-sectional analy- 
sis. This implies that one needs to carefully investigate the context, as tackling corruption 
might require more than pushing democratic reforms and increasing female representation
in political and economic positions.


The rest of the paper is organized as follows. Section 3.2 describes the data used, the
empirical estimation and the main results, which are discussed in section 3.3.
3.2 Empirical Estimation and Results


3.2.1 Data 


The definition of all variables and descriptive statistics are presented in Tables 3.1, 3.2 and 
3.3. Measuring corruption is a complex task as it has many faces. There is public corruption, 
which refers to the misuse of public office for private gain, and corruption that comprises 
the collusion between firms or misuse of corporate assets (Svensson, 2005). Other authors 
differentiate between grand and petty corruption. Grand corruption refers to activities of top
officials and big companies. Petty corruption refers to the activities of people at the lower
end of hierarchies (Pardo, 2004). 

We use two different measures of public corruption in our estimations comprising grand 
and petty corruption. The first measure is the Corruption Perception Index (CPI) of Transparency
International.3 The CPI measures the level of corruption in a country. It is based
on various data sources, business surveys and expert panels about perceptions of corruption, 
and is a comprehensive measure that covers the different forms of grand and petty corrup- 
tion in business, politics and administration. It is continuous and ranges from 0 meaning 
high corruption to 10 meaning low corruption (Lambsdorff, 2006). 

The second indicator is the Corruption in Government Index from the International Coun- 
try Risk Guide (ICRG) provided by the Political Risk Services.4 The ICRG index assesses 
the political risk associated with corruption and focuses in particular on those types of cor- 
ruption that lead to instability in the political system as they distort the economic and finan- 
cial environment, put foreign investments into risk and reduce the efficiency of government 
and business because people come to power not because of their ability but through patronage
and clientelistic practices.5 Hence, this measure gives the extent of political risk of
instability that is assumed to increase with corruption. Therefore, it is only under certain 
conditions an indicator of the level of corruption. Whether the political risk of instability 
caused by corruption coincides with the level of corruption depends on the degree of toler- 
ance towards corruption (Lambsdorff, 2006). The ICRG corruption index is a continuous 
variable that ranges from 0 to 6 with 0 meaning high risk and 6 indicating low risk. 

The subindex civil liberties (Subindex Civil lib.) is one of five composite indices (the 
others being subindex family code, subindex son preference, subindex physical integrity, 
subindex ownership rights) that measure social institutions related to gender inequality, and 
were presented in Essay 1. These social institutions are conceived as long-lasting norms, 

3Data are available at http://www.transparency.org/policy_research/surveys_indices/cpi.

4http://www.prsgroup.com/.

5http://www.prsgroup.com/ICRG_Methodology.aspx#PolRiskRating
traditions and codes of conduct that find expression in traditions, customs and cultural practices,
informal and formal laws and guide people's behavior and interaction. They shape
gender roles and therefore the social and economic opportunities of men and women. We 
use the subindex civil liberties in this study as it covers those social institutions that directly 
shape the opportunities of women to participate in social life. It therefore reflects better their 
opportunities to gain power in politics and economics than the other subindices related to 
gender inequality. Indeed, we find that the subindex civil liberties is the only subindex that 
is significant in the regression analysis. The subindex civil liberties is built out of two vari- 
ables of the OECD Gender, Institutions and Development Database (Morrison and Jütting, 
2005; Jütting et al., 2008), which are freedom of movement and freedom of dress. Freedom 
of movement measures the level of restrictions women face in moving freely outside their 
own household. Freedom of dress measures the extent to which women are obliged to follow 
a certain dress code in public, for example being obliged to cover their face or body when 
leaving the house. Both variables are ordinal taking the values 0, 0.5 and 1 with 0 indicating 
no restrictions and 1 indicating high restrictions on women. They are proxies of civil liberties
in the sense that when women are restricted from leaving the house it is difficult to imagine
that they can actively participate in social, political and economic life. Wearing a veil might 
be a form of self-determination and expression, and different traditions, styles and customs 
are connected to it. However, forced veiling is incompatible with agency, as it might be a 
sign of subordination in a society and might hinder interactions with other human beings - 
either as women cannot interact because they wear a veil or they can only interact if they 
wear a veil (Macdonald, 2006; Milallos, 2007). The subindex is the rescaled weighted sum 
of the two variables with the weights obtained from polychoric principal component analy- 
sis (Kolenikov and Angeles, 2009). The subindex goes from 0 (no gender inequality) to 1 
(high gender inequality). As the subindex civil liberties does not cover developed (OECD)
countries, the subsequent empirical analysis focuses on developing countries.
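
The construction of the subindex as a rescaled weighted sum can be sketched as follows. This is only an illustration: the weights below are hypothetical placeholders (the actual weights come from the polychoric principal component analysis in Essay 1 and are not reproduced here), and the rescaling is assumed to divide by the largest attainable weighted sum so that the result lies between 0 and 1.

```python
# Hypothetical weights, NOT the ones estimated in Essay 1.
ASSUMED_WEIGHTS = {"freedom_of_movement": 0.6, "freedom_of_dress": 0.4}
ALLOWED_VALUES = (0.0, 0.5, 1.0)  # ordinal coding used for both variables


def civil_liberties_subindex(scores, weights=ASSUMED_WEIGHTS):
    """Weighted sum of the ordinal scores, rescaled to the interval [0, 1]."""
    for name in weights:
        if scores[name] not in ALLOWED_VALUES:
            raise ValueError(f"{name} must be one of {ALLOWED_VALUES}")
    weighted_sum = sum(weights[name] * scores[name] for name in weights)
    # Assumed rescaling: divide by the largest attainable sum (all scores
    # equal to 1). The smallest attainable sum is 0 for positive weights.
    return weighted_sum / sum(weights.values())


no_restrictions = civil_liberties_subindex(
    {"freedom_of_movement": 0.0, "freedom_of_dress": 0.0})
high_restrictions = civil_liberties_subindex(
    {"freedom_of_movement": 1.0, "freedom_of_dress": 1.0})
mixed = civil_liberties_subindex(
    {"freedom_of_movement": 0.5, "freedom_of_dress": 0.0})
```

With any positive weights, a country without restrictions scores 0 and a country with the highest restrictions on both variables scores 1; intermediate cases fall in between.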


The variables that are contained in the subindex could be considered as proxies for re- 
ligion and therefore one could think that the subindex civil liberties might be a proxy for 
religion as well. When investigating the variation of the subindex over religion, one ob- 
serves that there is more variation within Muslim majority countries than in countries with 


6The variable freedom of dress takes the value 0 if less than 50% of women are obliged to follow a certain dress
code, 0.5 if more than 50% of women are forced to follow a certain dress code, and 1 if all women are obliged
to follow a certain dress code, or if it is punishable by law not to follow it. The variable freedom of movement
is 0 if less than 50% of women face restrictions on their movement outside the home, 0.5 if more than 50% of
women face restrictions on their movement outside the home, and 1 if women can never leave home without
restrictions (i.e. they need a male companion).
either Christian majority or countries without Christian or Muslim majority (Table 3.5). To
further examine whether the subindex measures Muslim religion, we plot the subindex civil
liberties against the percentage of Muslim population in a country (Figure 3.1). It is true
that countries having less than 50% Muslim population tend to have lower values on the
subindex civil liberties with the exception of India which scores 0.6 with about 15% of Muslim
population. For countries with more than 50% Muslim population the subindex shows
more variation. Noticeably, there are several countries that have more than 70% of Muslim
population and the value 0 on the subindex civil liberties.8 Consequently, there is no perfect
correspondence between the subindex and the percentage of Muslim population. Nevertheless,
in the regressions we include a Muslim and a Christian dummy (Muslim and Christian)
to control for the impact of religion, the left-out category being countries that have neither a
majority of Muslim nor a majority of Christian population.9

To account for female representation, which is highlighted by e.g. Swamy et al. (2001) 
and Dollar et al. (2001), we include three measures of female representation. We take data 
from World Bank (2009a) on the proportion of female legislators (Parliament), the female
share in professional, technical, administrative and managerial positions (Managers),10 and
women's share of labor force (Labor force). 


To capture democracy we choose the Electoral Democracy index (Electoral democ.) of 
Freedom House (2008) that takes the value 1 if there are competitive, universal, free and 
secret elections and a multiparty system. An alternative measure is the Polity2 index of 
the Polity IV Project that we use to check the robustness of the results as Polity2 measures 
more closely liberal democracy (Marshall and Jaggers, 2009).11 Unfortunately, it covers


7The variable freedom of movement varies over all three religious categories, while the variable freedom of 
dress has almost no variation in countries having a Christian majority or countries without Christian or Muslim 
majority, except for India and Sri Lanka. 

8Albania, Azerbaijan, Gambia, Guinea, Kyrgyz Republic, Mali, Morocco, Niger, Senegal, Sierra Leone,
Tajikistan, Tunisia, Turkmenistan, Uzbekistan.

9As Muslim religion is related to the subindex we also use the percentage of Muslim population instead of the
two religion dummies in the regressions. The results are unchanged.

10Both indicators have been criticized (Bardhan and Klasen, 1999; Dijkstra, 2002). In some countries, for
example communist ones, parliaments lack power and the representation of women in these parliaments does
not reflect actual power of women. Moreover, female representation in parliament measures only representation
at the national level and ignores women's participation at other levels of the state and in civil society. A
similar problem is attached to the representation of women in senior economic positions that only measures
formal sectors. In addition, this indicator does not fluctuate much over years. However, given that there
is a lack of data available for women's representation at the local and societal level as well as for informal
economic participation and to be comparable to other studies, we use both measures.

11Current data for the Polity IV Project can be found at http://www.systemicpeace.org/polity/polity4.htm.
fewer countries than the Electoral democracy index.12 Dollar et al. (2001), Swamy et al.
(2001) and Sung (2003) use either the Civil Liberties index,13 the Political Rights index or
the Freedom of the Press index of the Freedom House project as regressors in their empirical 
analysis to measure or to refine the measurement of democracy. It needs to be stressed that 
these measures are not without methodological problems as they include questions about 
bribing and other forms of corrupt behavior and are therefore by construction correlated with 
corruption. The Civil Liberties index includes questions on corruption that restrains free and 
independent media. The Political Rights index includes questions related to corruption in 
government. The Freedom of the Press index includes questions on the impact of corruption 
and bribery on content of the press. Moreover, Sung (2003) uses a rule of law index that is 
also problematic as rule of law is closely related to the prevalence of corruption. Therefore, 
only the Electoral Democracy index of Freedom House (2008) is included in our regressions 
to account for democracy. 
As additional controls we include: 


• the log of GDP per capita in constant prices to control for the level of economic development
  as combatting corruption might be costly, and as poorer people might tend to
  engage more in corrupt activities (log GDP)14 (Swamy et al., 2001);

• region dummies to capture geography and other unexplained regional heterogeneity,
  with Subsaharan Africa as the reference category (SA for South Asia, ECA for Europe
  and Central Asia, LAC for Latin America and Caribbean, EAP for East Asia and
  Pacific);

• ethnic fractionalization as it might increase corruption through clientelistic networks,
  identity politics and patronage along ethnic lines (e.g. Tripp, 2001) (Ethnic frac.);

• literacy rates to control for the knowledge of the population about laws against corruption,
  and as higher education might come along with less tolerance towards corruption
  (Swamy et al., 2001) (Literacy pop.);

• a measure of trade openness as trade barriers increase the incentives for corrupt behavior
  between individuals and customs officials (Ades and Tella, 1997; Gatti, 2004)
  (Openness);


12We use averages over ten years to capture stability of democracy. For the 121 countries for which both
Electoral democracy and Polity2 are available, the Pearson Correlation Coefficient between them is 0.90 and
significant.

13The Civil liberties index from Freedom House (2008) measures civil liberties in general and is not to be
mixed up with the subindex civil liberties related to gender inequality.

14US$, PPP, base year: 2005.
• a dummy indicating whether a country has never been a colony (Not colony) and a
  dummy measuring whether a country was a British colony (British colony) based on
  Correlates of War 2 Project (2003) as corruption might also be linked to the history of
  colonialism (Swamy et al., 2001).


The subindex civil liberties reflects the information available around the year 2000 and is 
not expected to change rapidly over time as social institutions are long-lasting and change 
only slowly and incrementally. For this reason, we use in the case of all other variables 
averages of the existing values over time to minimize the loss of observations due to missing 
values and to obtain a more stable value for the indicators used. For the corruption indicators 
representing our response variables we take averages over the years 2001 to 2005 for the 
CPI and over the period 2000-2004 in the case of the ICRG. For the other regressors we 
use averages over ten years (1996-2005), with the exception of ethnic fractionalization as 
changes in the ethnic composition of a country in less than 20 years are rare (Alesina et al., 
2003). Concerning the two democracy variables, choosing averages over ten years has the 
advantage of capturing the stability of a democratic system, which has been highlighted by 
Treisman (2007) as important for corruption. In addition, having a difference of five years 
between response variable and the regressors might help to alleviate endogeneity and capture 
delays until possible effects can be observed. 
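
The averaging of existing values over a period can be illustrated with a short sketch. The function name and the yearly scores are made up for illustration; the only assumption taken from the text is that missing years are skipped rather than causing the country to drop out of the sample.

```python
def period_average(series, first_year, last_year):
    """Mean of the non-missing values in [first_year, last_year].

    Returns None when no value exists in the window, in which case the
    country would still be lost for that variable.
    """
    values = [value for year, value in series.items()
              if first_year <= year <= last_year and value is not None]
    return sum(values) / len(values) if values else None


# Made-up CPI scores for one hypothetical country; 2003 is missing.
cpi_by_year = {2001: 2.5, 2002: 2.7, 2003: None, 2004: 2.9, 2005: 3.1}
cpi_average = period_average(cpi_by_year, 2001, 2005)  # mean of four values
```

Averaging in this way keeps a country in the sample as long as at least one yearly value exists in the window, which is the stated motivation for using averages.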


3.2.2 Empirical Estimation 


We empirically test with multiple linear regressions whether the subindex civil liberties $s_i$,
which measures the freedom of social participation of women, is correlated with a response
variable $y_i$ capturing the level of corruption, after controlling for other factors that have
been described in the literature as possible determinants of corruption.15 As was discussed
previously, we consider that social institutions related to gender inequality are relatively
stable and long lasting. Therefore, we assume that they do not depend on the response
variable for the period considered.16


We run regressions as 


$$y_i = \alpha + \beta s_i + \text{control variables}_i + \epsilon_i \qquad (3.1)$$


15Before conducting the multiple linear regression analysis, we account for the importance of GDP for corruption.
We first run a simple linear regression of each corruption measure on log GDP. We then compute the
estimated residuals from this regression and use them as the dependent variable in a new simple linear
regression where the subindex civil liberties is the only regressor. For both CPI and ICRG we obtain a negative
and significant coefficient for the subindex civil liberties which suggests that the subindex is able to account
for something that goes beyond GDP when explaining corruption.

16In general, social institutions, i.e. normative frameworks, change only slowly and incrementally.
using information at the country level. We are mainly interested in testing the null hypothesis
that coefficient $\beta$ is zero at a statistical significance level of 10%. The control variables
included to attenuate omitted variable bias are described in Table 3.1. We acknowledge, 
however, that it is impossible to entirely rule out this problem. 


To reproduce the findings from the literature, we first run a regression without the subindex 
civil liberties to focus on the effects of democracy and representation of women, which have 
been largely discussed. In a second step, we add to the regressions the subindex civil liber- 
ties as a measure of society's attitude towards gender inequality, as it can be argued that it 
is a variable that has been omitted in the previous regressions (Swamy et al., 2001). We run 
each specification for the two measures of corruption and use each time one of the two alternative
measures of democracy. At the end, we present four regressions for each corruption
indicator.


Preliminary regressions not reported here suggest that heteroscedasticity is a possible 
issue in our data and that there are influential observations that could drive the results. If our 
model is well specified, the OLS estimator of the regression parameters remains unbiased in 
the presence of heteroscedasticity, but the estimator of the covariance matrix of the parameter 
estimates can be biased and inconsistent, making inference about the estimated regression 
parameters problematic. Violations of homoscedasticity can lead to hypothesis tests that 
are not valid and confidence intervals that are either too narrow or too wide. To deal with 
heteroscedasticity, we run the regressions with OLS and 'heteroscedasticity-consistent' (HC)
standard errors. As our sample sizes are less than 150, we use HC3 robust standard errors
proposed by Davidson and MacKinnon (1993), which are better with small samples.17
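
To make the HC3 correction concrete, the following sketch computes OLS coefficients and HC3 standard errors with NumPy. It is an illustration on synthetic data, not the estimation code behind the reported tables: the variable names, the data-generating process and the two regressors are assumptions chosen only to produce heteroscedastic errors.

```python
import numpy as np


def ols_hc3(X, y):
    """OLS coefficients and HC3 standard errors (X must include a constant)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    residuals = y - X @ beta
    # Leverage h_ii: diagonal of the hat matrix X (X'X)^-1 X'.
    leverage = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    # HC3 weights each squared residual by 1 / (1 - h_ii)^2.
    omega = residuals**2 / (1.0 - leverage) ** 2
    meat = X.T @ (X * omega[:, None])
    cov_hc3 = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov_hc3))


# Synthetic illustration: the error variance grows with the regressor s.
rng = np.random.default_rng(0)
n = 100
s = rng.choice([0.0, 0.3, 0.6, 1.0], size=n)    # stand-in for the subindex
log_gdp = rng.normal(8.0, 1.0, size=n)           # stand-in for log GDP
noise = rng.normal(0.0, 0.5 + s, size=n)         # heteroscedastic errors
y = 1.0 - 1.5 * s + 0.8 * log_gdp + noise        # stand-in corruption score
X = np.column_stack([np.ones(n), s, log_gdp])

beta, se_hc3 = ols_hc3(X, y)
```

The coefficient estimates are the usual OLS ones; only the covariance matrix changes. Because each squared residual is divided by (1 - h_ii)^2, observations with high leverage receive more weight than under the uncorrected estimator.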


For all the regressions, we check whether the results concerning the subindex civil liber- 
ties are stable in three ways. First, it is clear that in the multiple regressions, the estimate 
of the effect of our main variable, the subindex civil liberties, depends on the values of the 
other explanatory variables included (Mukherjee et al., 1998). We also try a simpler model 
to confirm that the estimated coefficient of the subindex civil liberties is negative and statistically
significant. In this smaller model and based on the arguments presented before, we
include as additional regressors the variables capturing the representation of women in society,
a measure of democracy, the log GDP, religion dummies and regional dummies. This
has the advantage that fewer parameters have to be estimated with the available observations.

17Simulation studies by Long and Ervin (2000) have shown that HC standard error estimates tend to maintain
test size closer to the nominal alpha level in the presence of heteroscedasticity than OLS standard error
estimates that assume homoscedasticity. These authors recommend the use of HC3 robust standard errors,
especially for sample sizes less than 250, as they can keep the test size at the nominal level regardless of the
presence or absence of heteroscedasticity, with only a minor loss of power associated when the errors are
indeed homoscedastic. We acknowledge that heteroscedasticity-consistent standard errors are not a panacea
for inferential problems under heteroscedasticity. As pointed out by some authors, there are limitations and
trade-offs in these estimators (e.g. Kauermann and Carroll, 2001; Wilcox, 2001).

Second, we use bootstrap with 1000 replications to compute a Bias-corrected and accelerated
(BCa) 90% confidence interval of the regression coefficients computed with OLS to
confirm that the value zero is not contained in the confidence interval around $\beta$ (Efron and
Tibshirani, 1993). One of the main advantages of bootstrapping methods is that one does 
not make any assumptions about the sampling distribution or about the statistic. Third, we 
detect observations with high influence or leverage based on the first estimates (OLS with 
standard variance estimator) using Cook's distance. Cook's distance is a commonly used 
estimate of the influence of a data point when doing least squares regression, and it measures 
the effect of deleting a given observation. We exclude the countries identified as outliers 
from the sample if the value of Cook's distance is larger than 4/n, with n being the number 
of observations, and re-estimate regression 3.1 on the restricted sample using HC3 robust 
standard errors. 
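
The bootstrap check described above can be sketched as follows. The text does not say which resampling scheme was used, so this illustration assumes a pairs bootstrap (resampling whole observations) and estimates the acceleration constant with the jackknife; the function names and the synthetic data are likewise assumptions.

```python
from statistics import NormalDist

import numpy as np


def ols_coef(X, y, k):
    """OLS estimate of the k-th regression coefficient."""
    return np.linalg.lstsq(X, y, rcond=None)[0][k]


def bca_interval(X, y, k, level=0.90, replications=1000, seed=0):
    """BCa bootstrap confidence interval for coefficient k (pairs bootstrap)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    std_normal = NormalDist()
    estimate = ols_coef(X, y, k)

    # Bootstrap distribution: resample observations with replacement.
    boot = np.empty(replications)
    for b in range(replications):
        idx = rng.integers(0, n, size=n)
        boot[b] = ols_coef(X[idx], y[idx], k)

    # Bias correction: where the original estimate sits in that distribution.
    share_below = np.clip(np.mean(boot < estimate),
                          1.0 / replications, 1.0 - 1.0 / replications)
    z0 = std_normal.inv_cdf(float(share_below))

    # Acceleration from leave-one-out (jackknife) estimates.
    jack = np.array([ols_coef(np.delete(X, i, axis=0), np.delete(y, i), k)
                     for i in range(n)])
    diff = jack.mean() - jack
    accel = np.sum(diff**3) / (6.0 * np.sum(diff**2) ** 1.5)

    # Adjusted percentiles replace the plain alpha/2 and 1 - alpha/2 ones.
    alpha = (1.0 - level) / 2.0
    bounds = []
    for prob in (alpha, 1.0 - alpha):
        z = std_normal.inv_cdf(prob)
        adjusted = std_normal.cdf(z0 + (z0 + z) / (1.0 - accel * (z0 + z)))
        bounds.append(float(np.quantile(boot, adjusted)))
    return bounds[0], bounds[1]


# Synthetic illustration with a clearly negative slope on s.
rng = np.random.default_rng(42)
n = 80
s = rng.uniform(0.0, 1.0, size=n)
y = 2.0 - 1.5 * s + rng.normal(0.0, 0.5, size=n)
X = np.column_stack([np.ones(n), s])

lower, upper = bca_interval(X, y, k=1)
zero_excluded = not (lower <= 0.0 <= upper)
```

The check in the text corresponds to `zero_excluded`: the coefficient is treated as distinguishable from zero at the 10% level when the 90% interval does not contain zero.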

One should consider that possible endogeneity of the regressor $s_i$ (the subindex civil liberties),
meaning that $s_i$ is correlated with the error term $\epsilon_i$ in the regression, might lead to an
estimated coefficient of $s_i$ that is biased. Endogeneity might arise due to omitted variables,
measurement error and simultaneity (Wooldridge, 2002). The control variables included in
the regression aim at minimizing omitted variable bias, albeit one cannot rule out this problem.
We do not find it plausible that there are measurement errors in $s_i$ which are related to
the unobserved 'true' social institutions. Simultaneity could arise if $s_i$ is determined simultaneously
with the dependent variable $y_i$. As was discussed previously, social institutions
related to gender inequality $s_i$ are relatively stable and long-lasting. Hence, it is unlikely that
the response variable $y_i$ influences $s_i$.


3.2.3 Results 


Results for the CPI as the first measure of corruption are presented in Table 3.6. Specifi- 
cations (1) and (2) do not include the subindex civil liberties. In both specifications, none 
of the democracy variables Electoral democracy and Polity2 are significant. From the three 
measures of representation of women only Parliament is significant and positively related to 
corruption in specification (1) where Electoral democracy is the measure of democracy. Of 
the control variables only GDP has a significant and positive coefficient. In specifications (3) 
and (4) the subindex civil liberties is added as a new regressor to the former specifications. 
Its coefficient is negative and significant in both. Both democracy variables as well as the 
measures for participation of women in the economy are not significant. Only Parliament 
carries a positive and significant coefficient when Electoral democracy is used (specification 
(3)). In the same specification (3) two control variables besides log GDP become significant:
British colony and the regional dummy for ECA. For all four specifications the adjusted
R-square is around 0.5.

Table 3.7 shows the results when ICRG is used as the measure of corruption. For all 
4 specifications (1-4), none of the variables reflecting representation of women and none 
of the democracy measures is significant. Interestingly, log GDP is also insignificant in all 
specifications, whereas it is always significant when the CPI is used as measure of corruption. 
Openness is the only control variable which is significant in all specifications. Important for 
the results of this paper, the subindex civil liberties is significant in specifications (3) and 
(4), and adding it to the corresponding regressions yields values for adjusted R-square that 
are noticeably larger than without it. It must be noted, however, that the obtained values for 
adjusted R-square for the regressions with the ICRG are lower than for the CPI (between 0.2 
and 0.3 for the ICRG and around 0.5 for the CPI), suggesting that the model is not able to 
explain much of the variation of the political risk of instability due to corruption. 

Using the smaller model yields similar results for the subindex civil liberties and the
variables measuring representation of women and democracy.18 These findings also withstand
the two other robustness checks. First, we confirm with bias-corrected and accelerated (BCa)
confidence intervals that in all cases the value zero is not contained in the 90% confidence
interval around the regression coefficient of the subindex civil liberties. Second, when we
exclude outliers (6 to 7 countries) and re-run specifications (3) and (4) for both corruption
measures, the subindex civil liberties remains significant in all estimations. It is worth
mentioning that for every restricted sample, the adjusted R-square is higher than in the
corresponding complete sample.19
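
The first robustness check can be illustrated with a minimal sketch of a bias-corrected and accelerated (BCa) bootstrap interval for a regression slope. This is not the authors' code: it uses a single-regressor OLS slope as a stand-in for the multivariate coefficient, the data are simulated, and all names (`ols_slope`, `bca_interval`) and settings (2000 replicates, the seeds) are hypothetical.

```python
import random
from statistics import NormalDist


def ols_slope(pairs):
    """OLS slope of y on x (with a constant) for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / sxx


def bca_interval(pairs, stat, level=0.90, n_boot=2000, seed=0):
    """Return (estimate, lower, upper) using the BCa bootstrap."""
    rng = random.Random(seed)
    nd = NormalDist()
    n = len(pairs)
    theta_hat = stat(pairs)

    # Bootstrap replicates: resample whole observations with replacement.
    boots = sorted(
        stat([pairs[rng.randrange(n)] for _ in range(n)]) for _ in range(n_boot)
    )

    # Bias correction from the share of replicates below the estimate
    # (assumes this share is strictly between 0 and 1).
    share_below = sum(b < theta_hat for b in boots) / n_boot
    z0 = nd.inv_cdf(share_below)

    # Acceleration from leave-one-out (jackknife) estimates.
    jack = [stat(pairs[:i] + pairs[i + 1:]) for i in range(n)]
    jack_mean = sum(jack) / n
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den > 0 else 0.0

    def adjusted_quantile(q):
        z = nd.inv_cdf(q)
        return nd.cdf(z0 + (z0 + z) / (1.0 - accel * (z0 + z)))

    alpha = 1.0 - level
    lo_idx = min(n_boot - 1, int(adjusted_quantile(alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, int(adjusted_quantile(1 - alpha / 2) * n_boot))
    return theta_hat, boots[lo_idx], boots[hi_idx]


# Simulated data with a negative true slope (hypothetical values).
rng = random.Random(42)
data = [(x, 5.0 - 1.5 * x + rng.gauss(0.0, 0.5))
        for x in (rng.random() for _ in range(80))]
slope, lower, upper = bca_interval(data, ols_slope)
print(f"slope = {slope:.3f}, 90% BCa interval = [{lower:.3f}, {upper:.3f}]")
```

The check described in the text corresponds to verifying that zero lies outside the interval returned for the coefficient of interest.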

Summarizing the results, when we do not include the subindex civil liberties we find that,
of all the variables for representation of women, only Parliament is significant in the case of
the CPI as long as Electoral democracy is used as the measure of democracy. If one uses Polity2
instead, Parliament becomes insignificant. None of the democracy measures used turns out
to be significant. When we include the subindex civil liberties, the results for representation
of women and the democracy variables stay unchanged. Neither representation of women, 
except Parliament in the case of CPI when Electoral democracy is used, nor the democracy 
variables are significantly related to corruption. The main result concerning the subindex
civil liberties is that even after controlling for democracy and for measures of political and
economic participation of women as well as for other factors, we find a robust and significant
relationship between the subindex civil liberties, which reflects society's attitude towards
gender inequality, and the level of corruption. Social institutions favoring gender inequality
are associated with higher levels of corruption.

18 With the smaller model we also run three separate regressions for (i) Muslim majority countries,
(ii) Christian majority countries, and (iii) countries without Christian or Muslim majority. As
could be expected, the subindex civil liberties is statistically significant only in the
regressions with Muslim majority countries, as in other countries there is not enough variation
in the subindex.
19 Results for all the robustness checks are not reported here, but are available upon request.


3.3 Conclusion 


The literature investigating the link between gender and corruption finds that there is a
relationship between female representation in political and economic life and the level of
corruption in a country. However, some studies warn that the observed relationship may be
due to omitted variable bias. A possible variable that might influence both the participation of
women and corruption is liberal democracy (e.g. Sung, 2003). We introduce a further omitted
variable that has either been neglected in the literature or not been adequately dealt with
because of insufficient data. Swamy et al. (2001) refer to this as the "level of discrimination
against women" and proxy it with the gaps in educational attainment and life expectancy
between men and women. We use the subindex civil liberties, which we consider a better
proxy of the "level of discrimination against women" as it captures social institutions that
restrict women in their freedom to participate in public life and reflect society's attitude
towards gender inequality. The subindex measures underlying institutions and not outcomes
of these institutions, as do the variables used by Swamy et al. (2001).

When we replicate the findings of the literature for our sample of developing countries 
without the social institutions indicator, the results support the hypothesis of Sung (2003) 
and others that, when liberal democracy (in our case measured with Polity2) is considered 
in the regression, the representation of women in political and economic life is insignificant. 
However, Sung's hypothesis is weakened by the fact that there is no statistically significant 
association between democracy and corruption. Consequently, our statistical results support 
neither Sung's arguments nor the arguments put forward by Swamy et al. (2001) and Dollar 
et al. (2001) that representation of women is negatively related to corruption.20 These results
make it difficult to interpret social institutions related to gender inequality as an omitted
variable when one investigates the relationship between representation of women in society,
democracy and corruption.21

20 Once again, our sample includes only developing countries, while the other studies include
developed countries as well.
21 We have estimated with multivariate regressions, not reported here, whether there is (1) a
relationship between democracy and the subindex civil liberties and (2) a relationship between
representation of women in society and the subindex civil liberties in our sample of developing
countries, but did not find significant results.

Once we include the subindex civil liberties as a regressor, the main finding is that after
controlling for representation of women in political and economic life and for democracy, it
has a robust, negative and significant relationship with both corruption measures, on which
higher values indicate less corruption. In countries where social institutions inhibit the
freedom of women to participate in social life, the level of corruption is higher.

Admittedly, one has to be cautious with these results. Interpreting these findings in
the light of the theories discussed is difficult, and country or regional studies are needed.
Measurement is another relevant issue, as the concepts of social institutions, democracy,
participation of women and corruption are all hard to operationalize. The measures used in
this study could be contaminated by measurement error. Finally, it cannot be ruled out that
another factor, which has been omitted from the analysis, shapes the results.

Nevertheless, we derive one policy implication from this study, which should be mainly 
targeted at developing countries. In a context where social institutions deprive women of 
the freedom to participate in social life, neither political reforms towards democracy nor 
the representation of women in political and economic positions might be enough to reduce 
corruption. How women are treated in a society is not only important for them, but has major
implications for the functioning of the whole society.


3.4 Tables 


Table 3.1: Description and Sources of Variables

Variable              Definition and source

Measures of corruption

CPI                   Corruption Perception Index (CPI): comprehensive measure of the level of
                      corruption in a country that covers the different forms of grand and petty
                      corruption in business, politics and administration. Ranges from 0 (high
                      corruption) to 10 (low corruption).
                      (average of existing values over the last five years)
                      Source: Transparency International (TI)

ICRG                  Corruption in Government Index: assesses corruption within the political
                      system and focuses in particular on those types of corruption that lead to
                      instability in the political system. Ranges from 0 (high risk) to 6 (low risk).
                      (average of existing values over the last five years)
                      Source: International Country Risk Guide (ICRG)

Representation of women

Parliament            Proportion of seats held by women in national parliaments (%)
                      (average of the existing values over the last 10 years)
                      Source: World Bank (2009a)

Managers              Proportion of professional and technical, administrative and managerial
                      positions held by women (%)
                      (average of the existing values over the last 10 years)
                      Source: World Bank (2009a)

Labor force           Female labor force participation rate
                      (average of the existing values over the last 10 years)
                      Source: World Bank (2009a)

Democracy

Electoral democ.      Index that qualifies countries as electoral democracy when there exist
                      competitive, universal and free and secret elections and a multiparty system
                      that can access the media for political campaigning
                      (average of the existing values over the last 10 years)
                      Source: Freedom House (2008)

Polity2               Measure of democracy taking account of competitiveness of participation,
                      institutions and procedures, openness and competitiveness of executive
                      recruitment and constraints on the chief executive; ranges from -10 (highly
                      autocratic) to 10 (highly democratic), score 0 means country is democratic
                      (average of the existing values over the last 10 years)
                      Source: Marshall and Jaggers (2009)

Social inst. related to gender ineq.

Subindex civil lib.   Subindex Civil liberties that captures the freedom of social participation
                      of women
                      Source: Branisa et al. (2009a)

Control variables

log GDP               Log of GDP per capita, PPP (constant 2005 international $)
                      (average over the last 10 years)
                      Source: World Bank (2008)

SA                    Countries get a 1 if located in region South Asia, 0 otherwise.

ECA                   Countries get a 1 if located in region Europe and Central Asia, 0 otherwise.

LAC                   Countries get a 1 if located in region Latin America and the Caribbean,
                      0 otherwise.

MENA                  Countries get a 1 if located in region Middle East and North Africa,
                      0 otherwise.

EAP                   Countries get a 1 if located in region East Asia and Pacific, 0 otherwise.

Muslim                Countries get a 1 if at least 50 % of the population are Muslim, 0 otherwise.
                      Source: Central Intelligence Agency (2009)

Christian             Countries get a 1 if at least 50 % of the population are Christian,
                      0 otherwise.
                      Source: Central Intelligence Agency (2009)

Muslim percentage     Percentage of Muslim population
                      Source: Pew Forum on Religion and Public Life (2009)

Ethnic frac.          The ethnic fractionalization measure gives the probability that two
                      individuals selected at random from a population are members of different
                      groups. It is calculated with data on language and origin. The value 0 means
                      complete homogeneity and 1 complete heterogeneity.
                      Source: Alesina et al. (2003)

Literacy pop.         Literacy rate for the whole population
                      (average of the existing values over the last 10 years)
                      Source: Human Development Report (HDR) stats office

Openness              Imports of goods and services (% of GDP)
                      Source: World Bank (2008)

Not colony            Countries get a 1 if never colonized, 0 otherwise.
                      Source: Correlates of War 2 Project (2003)

British colony        Countries get a 1 if former British colony, 0 otherwise.
                      Source: Correlates of War 2 Project (2003)


Table 3.2: Descriptive statistics of variables used

Variable                     N     mean       sd      min      max

Measures of corruption
CPI                        115     3.17     1.37      1.2     9.32
ICRG                        97     2.17     0.74     0.25     4.32

Representation of women
Parliament                 119    10.76     7.03     0.00    29.56
Managers                   120     7.98     5.26     0.00    23.70
Labor force                122    55.10    16.75    10.96    92.96

Democracy
Electoral democ.           121     0.45     0.46     0.00     1.00
Polity2                     98     1.09     6.08    -9.00    10.00

Social inst. related to gender ineq.
Subindex Civil lib.        124     0.16     0.26     0.00     1.00

Control variables
log GDP                    116     7.98     1.12     5.61    10.55
SA                         125     0.06     0.23     0.00     1.00
ECA                        125     0.14     0.34     0.00     1.00
LAC                        125     0.18     0.38     0.00     1.00
MENA                       125     0.14     0.35     0.00     1.00
EAP                        125     0.14     0.35     0.00     1.00
Muslim                     125     0.33     0.47     0.00     1.00
Christian                  125     0.43     0.50     0.00     1.00
Muslim percentage          121    33.38    39.65     0.00    99.70
Ethnic frac.               121     0.51     0.24     0.04     0.93
Literacy pop.              122     0.74     0.22     0.17     1.00
Openness                   120    0.445     0.26     0.01     1.91
Not colony                 121     0.21     0.41     0.00     1.00
British colony             121     0.30     0.46     0.00     1.00


Table 3.3: Pearson Correlation Coefficient (ρ) between subindex Civil liberties and control
variables

Variable                       ρ     p-value    Number of obs.

log GDP                     0.20       0.036               114
SA                          0.33       0.000               123
ECA                        -0.25       0.006               123
LAC                        -0.29       0.001               123
MENA                        0.53       0.000               123
EAP                        -0.11       0.221               123
Muslim percent.             0.54       0.000               120
Muslim                      0.57       0.000               123
Christian                  -0.40       0.000               123
Ethnic                      0.08       0.392               119
Literacy population        -0.19       0.039               120
Openness                   -0.07       0.447               118
Not colony                 -0.06       0.549               119
British colony              0.36       0.000               119

Table 3.4: Pearson Correlation Coefficient (ρ) between the Corruption Measures

                         CPI       ICRG

CPI     ρ                  1
        obs              115
ICRG    ρ               0.58          1
        p-value       0.0000
        obs


Table 3.5: Variation of the subindex civil liberties over religion

Subindex civil     No Christian/      Christian     Muslim       Total
liberties          Muslim majority    majority      majority

0                        22               46           15           83
0.298                     5                8            1           14
0.301                     1                0            4            5
0.599                     1                0           15           16
0.781                     0                0            2            2
0.818                     0                0            1            1
1                         0                0            2            2

Total                    29               54           40          123


Table 3.6: Linear regressions with dependent variable CPI

                        Model 1      Model 2      Model 3      Model 4

Representation of women
Parliament              0.031*       0.033        0.032*       0.037
                        (0.018)      (0.023)      (0.018)      (0.023)
Managers                0.025        0.022        0.011        0.006
                        (0.029)      (0.032)      (0.031)      (0.034)
Labor force             0.007        0.009        0.001        0.004
                        (0.009)      (0.010)      (0.010)      (0.011)
Democracy
Electoral democ.        0.339                     0.263
                        (0.234)                   (0.231)
Polity2                              0.039                     0.032
                                     (0.025)                   (0.023)
Social inst. related to gender ineq.
Subindex Civil lib.                               -1.730***    -1.624*
                                                  (0.593)      (0.866)
log GDP                 0.710***     0.738***     0.766***     0.821***
                        (0.197)      (0.212)      (0.193)      (0.209)
Muslim                  -0.367       -0.271       0.049        0.107
                        (0.319)      (0.394)      (0.305)      (0.363)
Christian               -0.392       -0.240       -0.280       -0.131
                        (0.288)      (0.341)      (0.283)      (0.329)
Ethnic frac.            -0.334       -0.364       -0.267       -0.124
                        (0.628)      (0.824)      (0.595)      (0.809)
Literacy pop.           -0.928       -1.122       -0.470       -0.831
                        (1.070)      (1.193)      (1.009)      (1.091)
Openness                1.457        1.752        1.199        1.455
                        (1.106)      (1.435)      (1.063)      (1.378)
Not colony              0.135        0.146        0.331        0.197
                        (0.315)      (0.410)      (0.300)      (0.362)
British colony          0.478        0.313        0.611**      0.407
                        (0.298)      (0.391)      (0.298)      (0.387)
constant                -3.305**     -3.455*      -3.364**     -3.809*
                        (1.634)      (1.964)      (1.687)      (2.108)
Number of obs.          103          86           103          86
R2                      0.576        0.580        0.613        0.607
Adjusted R2             0.491        0.474        0.530        0.501
Prob > F                0.000        0.000        0.000        0.000

HC3 robust standard errors in brackets.
Regional dummies included in all estimations.
* p < 0.10, ** p < 0.05, *** p < 0.01


Table 3.7: Linear regressions with dependent variable ICRG

                        Model 1      Model 2      Model 3      Model 4

Representation of women
Parliament              0.015        0.012        0.016        0.016
                        (0.018)      (0.020)      (0.014)      (0.017)
Managers                0.025        0.025        0.010        0.011
                        (0.021)      (0.021)      (0.017)      (0.019)
Labor force             -0.003       -0.000       -0.009       -0.006
                        (0.007)      (0.008)      (0.007)      (0.008)
Democracy
Electoral democ.        0.273                     0.221
                        (0.234)                   (0.223)
Polity2                              0.029                     0.027
                                     (0.025)                   (0.025)
Social inst. related to gender ineq.
Subindex Civil lib.                               -1.488***    -1.260**
                                                  (0.425)      (0.604)
log GDP                 0.122        0.081        0.153        0.123
                        (0.149)      (0.182)      (0.135)      (0.166)
Muslim                  -0.337       -0.229       0.076        0.070
                        (0.293)      (0.316)      (0.261)      (0.315)
Christian               -0.351       -0.321       -0.300       -0.289
                        (0.272)      (0.338)      (0.257)      (0.333)
Ethnic frac.            0.507        0.349        0.655        0.652
                        (0.427)      (0.465)      (0.410)      (0.496)
Literacy pop.           -0.165       0.118        0.404        0.436
                        (0.930)      (0.988)      (0.769)      (0.873)
Openness                1.277**      1.523**      0.991*       1.274**
                        (0.625)      (0.650)      (0.588)      (0.596)
Not colony              0.033        0.122        0.255        0.177
                        (0.237)      (0.304)      (0.308)      (0.396)
British colony          -0.022       -0.055       0.131        0.067
                        (0.228)      (0.289)      (0.210)      (0.293)
constant                0.474        0.529        0.461        0.351
                        (1.082)      (1.193)      (0.924)      (1.094)
Number of obs.          86           72           86           72
R2                      0.361        0.423        0.462        0.482
Adjusted R2             0.201        0.241        0.318        0.306
Prob > F                0.005        0.001        0.000        0.001

HC3 robust standard errors in brackets.
Regional dummies included in all estimations.
* p < 0.10, ** p < 0.05, *** p < 0.01


3.5 Figures


Figure 3.1: Scatter plot: Subindex Civil liberties and percentage of Muslim population


Regional growth convergence in Colombia


Essay 4


Revisiting the regional growth convergence debate in Colombia using income indicators


Abstract 


This paper investigates growth convergence across Colombian departments during the period 
of 1975 to 2000, following both the regression and the distributional approaches suggested 
in the literature, and using two income measures computed by Centro de Estudios Ganaderos 
(CEGA). We also discuss issues related to data provided by Departamento Administrativo 
Nacional de Estadísticas (DANE) used by previous convergence studies. Our results show 
no evidence supporting convergence using per capita gross departmental product, but rather 
persistence in the distribution. Using per capita gross household disposable income, we 
find some evidence of convergence, but only at a low speed, close to one percent per year. 
Furthermore, we find no evidence of the existence of different steady states for the two
variables considered.


Based on joint work with Adriana Cardozo.


4.1 Introduction


One of the most interesting and disputed questions in the economics discipline during the 
last half century has been whether or not poor countries tend to catch up with wealthier ones 
over time or if, on the contrary, the gap between the rich and poor widens. This question 
also reflects an interest in understanding the distribution of outcomes across countries and, 
implicitly, the determinants of growth (Durlauf et al., 2005). 

Empirical research on this topic is based upon macroeconomic aggregates and has con- 
centrated on testing the neoclassical growth model of Solow (1956) using the estimation 
method proposed by Barro and Sala-i-Martin (1991) to investigate whether economies with 
lower capital per person at a certain initial point in time tend to grow more quickly than 
economies with higher capital per person. If this is the case, there is convergence among 
economies over the long run. 
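
The test described above is commonly written as a cross-section regression of average growth on initial log income, (1/T) ln(y_T / y_0) = a - [(1 - e^(-βT)) / T] ln(y_0) + u, where β is the speed of convergence. The following is a minimal sketch under stated assumptions, not the estimation used in this essay: the data are simulated, and the function names, the number of units (24), the 25-year horizon and the one-percent speed are hypothetical choices for illustration.

```python
import math
import random


def ols_slope_intercept(x, y):
    """Ordinary least squares for a single regressor with a constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def implied_beta(slope, years):
    """Invert slope = -(1 - exp(-beta * years)) / years for beta."""
    return -math.log(1.0 + slope * years) / years


def simulate_regions(n_regions, years, beta, seed=0):
    """Simulated (hypothetical) data that converge at speed beta by construction."""
    rng = random.Random(seed)
    log_y0 = [rng.uniform(6.0, 9.0) for _ in range(n_regions)]
    coef = -(1.0 - math.exp(-beta * years)) / years
    growth = [0.10 + coef * ly + rng.gauss(0.0, 0.001) for ly in log_y0]
    return log_y0, growth


log_y0, growth = simulate_regions(n_regions=24, years=25, beta=0.01)
slope, intercept = ols_slope_intercept(log_y0, growth)
beta_hat = implied_beta(slope, years=25)
print(f"slope = {slope:.4f}, implied speed of convergence = {beta_hat:.4f}")
```

A negative slope means that initially poorer units grew faster on average; the implied β translates that slope into an annual speed of convergence.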

The convergence question has also been studied within particular countries to analyze
how much regional disparities diminish over time. The difference from cross-country
convergence analysis is that, across countries, it is risky to assume common values for
key model parameters, such as technology, savings, and population growth rates. By
contrast, within a single country, it is plausible to assume that regions exhibit similarities
in these and other variables, such as language, institutions, and preferences. This presumed
homogeneity has led researchers to assume that convergence is more likely to hold within,
rather than across, countries (Barro and Sala-i-Martin, 2004).

Empirical research supports regional convergence within industrial countries over the long 
run. Typical examples are given by Barro and Sala-i-Martin (1992b) who find convergence 
across U.S. states between 1880 and 2000, across Japanese prefectures between 1930 and 
1990, and between regions in eight European countries between 1950 and 1990 (see also 
Barro and Sala-i-Martin (1992a)). 

In the case of Colombia, a heterogeneous country at the department level in economic, 
geographic, and cultural aspects, existing research is contradictory. While some authors 
argue that Colombia was a successful case of convergence in the second half of the twentieth 
century, others argue for the persistence of regional disparities. 

The objective of this study is to investigate whether or not Colombia was a case of conver- 
gence at the department level between 1975 and 2000 using two different income variables: 
gross departmental product and gross personal disposable income. We consider that the 
second variable is more appropriate for measuring convergence in well-being. 

The study is constructed around three main questions. First, the study evaluates whether
departments converged between 1975 and 2000 and, if so, whether convergence results obtained
using the regression approach contradict the results obtained with the distributional approach
suggested by Quah (1997), using bivariate kernel density estimators. Second, we determine
whether the assumption of a common steady state for all departments holds or whether there is
evidence of heterogeneity in the model parameters. Finally, the study evaluates whether the
presence or absence of convergence occurs simultaneously in gross departmental product
and in gross personal disposable income.
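
To illustrate the distributional approach, a bivariate kernel density can be estimated over pairs of relative incomes (a department's income divided by the national average) at the start and at the end of a period. Mass concentrated along the 45-degree line indicates persistence, whereas mass concentrated around the value 1 on the end-of-period axis, regardless of the starting position, would indicate convergence. A minimal sketch, assuming a product Gaussian kernel with fixed bandwidths; the data points, bandwidths and function name are hypothetical and are not taken from the study.

```python
import math


def bivariate_kde(points, bandwidth_x, bandwidth_y):
    """Return a function f(x, y) giving a product-Gaussian kernel density estimate."""
    n = len(points)
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth_x * bandwidth_y)

    def density(x, y):
        total = 0.0
        for px, py in points:
            ux = (x - px) / bandwidth_x
            uy = (y - py) / bandwidth_y
            total += math.exp(-0.5 * (ux * ux + uy * uy))
        return norm * total

    return density


# Hypothetical relative incomes (department / national average) in the
# first and last year; they lie close to the 45-degree line by construction.
points = [(0.5, 0.55), (0.8, 0.8), (1.0, 1.0), (1.2, 1.15), (1.6, 1.7)]
density = bivariate_kde(points, bandwidth_x=0.2, bandwidth_y=0.2)

on_diagonal = density(1.0, 1.0)
off_diagonal = density(1.6, 0.5)
print(on_diagonal > off_diagonal)
```

In applied work the bandwidths would be chosen by a data-driven rule rather than fixed by hand as they are here.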

An important contribution of the study is the first ever test of the convergence hypothesis 
using time-series cross-sectional Colombian data with different specifications to check the 
robustness of results. The results are based upon data from Centro de Estudios Ganaderos 
(CEGA) because those data provide the longest time series (25 years) computed with a
consistent methodology.1

To summarize our results, we do not find convergence in gross departmental product and 
find no evidence of different steady states across departments using that variable. When 
using gross personal disposable income, we find convergence, but a very slow one, and no 
evidence of different steady states. For both variables, when using the regression approach, 
we find that the best estimators can be achieved using pooled time-series cross-section data 
and assuming homogeneity in the parameters. Furthermore, considering both variables, we 
do not find a contradiction in results obtained using the regression and the distributional 
approaches. Using bivariate kernel density estimators, we find persistence in the distribution 
of gross departmental product and slight convergence in gross personal disposable income. 

One important policy implication of our results is the need to periodically review whether 
or not departmental disparities diminish over time based on consistent time series constructed 
under a single methodology. We explicitly warn that linking different time series computed 
with different methodologies can lead to incorrect conclusions for interventions, such as 
poverty-alleviating policies and growth strategies. In keeping with previous studies on this
topic (e.g. Bonet and Meisel, 2006a), we consider it important to have an explicit regional
policy in Colombia to foster growth in departments lagging behind national averages,
after conducting case studies to assess which policies could be most effective in each case.


4.2 Motivation and Background 


4.2.1 Economic Background 


One remarkable characteristic of Colombia is the large income inequality which exists at
different levels: between individuals, between rural and urban areas, and between departments.
The country is currently divided into 32 departments and the capital district of Bogotá.
Departments may also be grouped into 5 regions: the Caribbean Region, comprising departments
with access to the Caribbean Sea; the Pacific Region, with departments on the west coast along
the Pacific Ocean; the Central Region, covering the three branches of the Andes mountain
chains; Orinoquia, comprising large plains to the south-east of the country; and Amazonia in
the south, comprising the Colombian part of the Amazon rainforest (see the map of Colombia
in Figure 4.1).

1 CEGA was a large research center financed by a private financial institution in Colombia.


Economic growth over the last 30 years, which was low but stable compared to other
countries in the region, has been accompanied by a high incidence of poverty,
inequality, and violence. In 2004, the percentage of people living below the poverty line
(headcount index) was 52 percent and the Gini coefficient was 0.58. The homicide rate was
63 per 100,000 people. Evidence shows that growth slowed compared to long-term historical
trends after 1970. In fact, after having achieved in 1970 a growth rate of 3.1 percent in per
capita gross domestic product, growth between 1980 and 1990 occurred at an average annual
rate of only 1.2 percent, due primarily to the adverse effects of Latin America's debt crisis.
In the 1990s, the average growth rate was similar (1.1 percent), driven by a boom and bust
cycle throughout the decade, which concluded in a severe recession in 1999, when per capita GDP
contracted by 5.5 percent (Table 4.1). By contrast, between 2002 and 2007, favorable
external conditions, especially high commodity prices and confidence due to the easing of
internal conflict, contributed to the acceleration of the economy (Tenjo G. and López E.,
2003; Cárdenas, 2007).


The heart of economic activity in Colombia lies in the Central or Andean Region which 
concentrates the largest proportion of population within the major cities. Bogotá and the de- 
partments of Cundinamarca and Antioquia account for 42 percent of total GDP with Bogotá 
having a high level of participation in total production (22 percent). This area concentrates 
not only manufacturing industry and commerce near the cities, but also coffee plantations 
and other large-scale agricultural areas. 


The GDP of departments in the Caribbean Region is based upon mining, small-scale agri- 
culture, and cattle farming. La Guajira and Cesar are the two largest producers of coal, while 
Córdoba is the largest nickel producer. Despite having some departments rich in minerals, 
this region nevertheless has a high incidence of poverty, particularly in Córdoba and Sucre. 


The Pacific Region comprises, relative to the Colombian average, three poor departments
and one wealthy one (Valle del Cauca). Chocó, which is the poorest department in this
region and in the country, is predominantly rural and sparsely populated, with large tropical
rain forests and humid areas. It is known as the rainiest area in the country (and even one
of the rainiest worldwide) and is geographically isolated from the rest of the country due to
a chain of mountains to the east and the ocean to the west. People living in the department
travel largely by way of its abundant creeks and rivers; road infrastructure is minimal. The
scarce literature explaining socio-economic factors in this department
argues that the current distribution of population and the quality of institutions may largely
be explained by the early settlement of an extractive economy during colonization, at which
time colonizers brought slaves to exploit gold mines but did not establish themselves in the
department (Bonet, 2007). As opposed to Chocó, Valle del Cauca is the third largest
departmental economy in the country after Bogotá and Antioquia and has some of the most
productive agricultural areas, as well as a high level of participation in the manufacturing
sector.


During the last 30 years, production was driven in some departments by the discovery
of important mineral resources, as is the case for the departments of Arauca and Casanare,
which have the largest oil fields in the country.2 The same applies to La Guajira, which has
the largest open coal mine in Latin America.


According to Meisel (2007b), the burden of poverty in Colombia is geographically lo- 
cated in the coastal departments and inequality is greater between departments than within 
them. Meisel argues that the urban-versus-rural divide is not the relevant dimension upon 
which to design poverty-alleviating programs, but the departmental one. Moreover, Meisel 
affirms that the already-large disparities have increased over the past 15 years and will not 
spontaneously disappear merely as a result of market forces. 


The level of empirical research addressing regional disparities in Colombia has increased 
gradually since the early nineties, inspired by the international debate on convergence and 
the methodology proposed by Barro and Sala-i-Martin (1991). Since then, approximately 20 
papers have investigated whether departments, regions, or even major cities have converged 
over time. Important shortcomings in this field arise due to the absence of consistent
time-series data allowing for a long-term perspective. As a consequence, results frequently
depend upon how the researcher combined the available time series, as well as on the methodology
and control variables used, with no robust and undisputed evidence concerning departmental
convergence.


Debate in this field in Colombia revolves around two issues: first, a methodological
discussion as to whether to rely on the methodology proposed by Barro and Sala-i-Martin
(1992a) or on the distributional approach proposed by Quah (1993b), and second, whether
one should use information generated by Departamento Administrativo Nacional de Estadísticas
(DANE), rather than by Centro de Estudios Ganaderos (CEGA).3

2 These departments are included in our sample as one group named Nuevos Departamentos (Nuevos),
meaning new departments. The so-called New Departments are distributed in the south-east lowland
plains, the Amazon Region, and the Caribbean islands. Excepting the islands, these departments
are large in extension but have low population densities.

Early studies used Barro-type regressions. The pioneering work of Cárdenas and Pontón
(1995), combining early GDP data by department from the National Planning Department 
with those produced by DANE, concluded that between 1950 and 1990, Colombia was a 
successful case of convergence with a 4-percent speed of convergence, and that migration 
played an insignificant role in convergence. Alternative combinations of data from DANE 
yield different results, despite using the same methodology. For instance, Barón (2003) finds 
convergence during the eighties but not during the nineties. Research using kernel density 
estimators concluded that Colombia was a case of polarization with the existence of three 
groups: a wealthy one that diverges from the average national income, a middle income 
one that shows convergence inside the group, and a third one that grows more impoverished 
over time (Birchenall and Murcia, 1997). Using CEGA data, research points to polarization 
in favor of the capital district of Bogotá, to the detriment of departments located in the 
peripheries (Bonet and Meisel, 2006b). Almost all studies focus only upon convergence in
income, while only three examine convergence in living standards using social indicators.4

The reader is then confronted with the question of whether Colombia is a successful
case of convergence or rather an example of persistence that is hopeless unless strong regional
redistributive policies are adopted.5

Intuitively, when observing the different geographic conditions of the country and the 
agglomeration processes around the largest cities, as well as the differences in infrastruc- 
ture, it is unrealistic to expect that poor departments can catch up with leading departments 
in terms of per capita product, given that they lack basic infrastructure and have a minor 
manufacturing and government presence. 

However, there are mechanisms that could have promoted convergence among departments
in recent years. One of them is fiscal equalization through central government transfers.
Starting in the mid eighties, the government implemented a decentralization program to
reduce the burden of spending by the central government. This process accelerated with the
new constitution of 1991, which established a new system of transfers in order to increase the
efficiency of social expenditures, as well as the supply of social services, compensate
municipalities with weak financial capacities, and increase political power and the participation
of local governments in the implementation of health and educational policies (Departamento
Nacional de Planeación DNP, 2002; Rojas, 2003; Barrera and Domínguez, 2006). As a result,
social spending increased from 7 to 15 percent between 1991 and 2001. Moreover,
starting in the eighties, the model of industrialization through import substitution changed
to the policy of liberalization of the economy, reduction of tariffs, and integration into the
world markets in order to increase competitiveness, productivity, and economic growth. This
shift also accelerated after the constitutional reform.

3 DANE is the official statistical agency in Colombia (http://www.dane.gov.co/).
4 A comprehensive list of convergence studies in Colombia can be found in Aguirre (2008). We deal
with regional convergence in social indicators in Colombia in Essay 5.
5 Research using alternative methodologies and looking for linkages among regions found that
Colombia has limited spatial interdependency (Haddad et al., 2008).

Another possible mechanism for convergence is migration. In general, the country un- 
derwent an important urbanization process in recent decades encouraged by industrialization 
around urban centers and a greater orientation towards export markets. Labor mobility 
combined voluntary migration for economic gain (which profited from increasing returns 
to scale in the manufacturing sector) with forced migration due to violence, which helped 
enlarge informal markets. Migration from rural to urban areas accelerated during the 
twentieth century. The urban share of the population rose from 59 percent in 1973 to 
75 percent in 2005, owing not only to the transformation from a predominantly 
agriculture-based economy to one based on services and industry, but also to conflict, 
violence, and a lack of opportunities in rural areas (Murad R., 2003). 

In this context, the substantive question we try to answer empirically in this study is 
whether or not Colombia was a case of convergence at the department level between 1975 
and 2000, that is, whether poor departments grew faster than wealthy ones over time and 
the gap between them decreased. Our interest lies in closing the debate on the existence 
of convergence across departments in Colombia by analyzing methodological issues and data 
sources that may have affected results up to now. One important motivation for this study 
is the policy implications of wrongly assuming that departments converge automatically 
over time. 

To clarify the importance of the data used and the possible combinations of time series, 
the next subsection describes the available data sources and the relevance of two 
variables, gross departmental product and gross personal disposable income, arguing that the 
second one is more appropriate for measuring convergence in well-being. 


4.2.2 Data Issues Affecting Convergence Results in Colombia 


There are two data sources for departmental accounts in Colombia: the Department of 
Statistics (DANE) and the Centro de Estudios Ganaderos (CEGA). 

DANE provides per capita GDP by department for three different periods: one for 1980 
through 1996 in constant 1975 prices, one for 1990 through 2005 in constant 1994 prices, 
and a final one for 2000 through 2005 in constant 2000 prices. The first series was 
calculated applying concepts of the System of National Accounts of 1986 (SNA-86) and used 
an indirect method for collecting information. The second series was calculated using the 
System of National Accounts of 1993 (SNA-93) and combined direct and indirect methods for 
collecting information.⁶ The third series did not include illicit crops in its estimation 
and is also based upon SNA-93. 


The classification of sectors, transactions, concepts, and methodology changed 
considerably in the SNA-93, which also allowed for the inclusion of illegal activities as 
part of GDP (DANE, 2008).⁷ It must be noted that statistical offices use different 
techniques to produce consistent time series of national accounts, particularly when 
international guidelines change.⁸ For instance, most (OECD) countries make regular 
revisions for short time periods (usually of about twenty years) to incorporate newly 
available information and benchmark revisions, in order to provide users with consistent 
time series. In Latin America, only Chile and Perú offer consistent long time series of 
regional per capita GDP using statistical or interpolation methods (Serra et al., 2006). 


In Colombia, DANE collected information for some overlapping years using both method- 
ologies, but did not construct a consistent time series based only upon one. Although users 
do not have enough information to consistently recompute long time series, they tend to 
rebase series and connect them using growth rates, which can be problematic.⁹ 


Comparison of the series for the overlapping years shows different departmental growth 
rates and a different evolution of the standard deviation of the logarithm of GDP, which 
explains why convergence results change depending on how and when the researcher linked 
the different data series. Note in Figure 4.6 that the annual standard deviation of the 
logarithm of GDP yields a different pattern in each of the three DANE series. In the 
series from 1980 to 1996, the standard deviation increases sharply starting in 1990, while 
in the series from 1990 to 2005, it remains close to 0.36 until 1997 and falls thereafter. 
In the third series (2000 to 2005), the trend is similar to that of the series for 1990 
through 2005, but the level of the standard deviation is higher. 


6 Direct methods take departmental information by product whenever data sources are available. Indirect ones 
use national aggregates and assign each department a percentage of those aggregates. 

7 The main changes concern the measurement of value-added taxes, the reclassification of transactions in the 
government sector, changes to the capital account, and productivity levels for the banking, energy, and 
insurance sectors. 

8 Techniques can be broadly classified into four groups: detailed reworking, proportion methods, interpolation 
between benchmarks, and indicator methods. 

9 For instance, the Canadian statistical office explicitly prohibits users from simply rebasing series using growth 
rates due to the large methodological differences derived from changing to SNA-93, and argues that only 
the statistics office in charge may compile series using detailed accounting and recomputing information 
according to the new procedures (Lal, 1999). 


The CEGA project compiled information at the departmental level in Colombia from 1975 
to 2000 using SNA-93 and presented a simplified system of national accounts. The project 
used mixed methods for collecting information, classified some particular products 
differently than DANE, and did not include illicit crops in the agriculture category. 
Departmental results coincide between CEGA and DANE from 1990 onwards because both use 
SNA-93 (there are, however, important differences before 1990). 


CEGA produced consistent time series of two key variables relevant for convergence anal- 
ysis, gross departmental product, which we will call henceforth PDB, and gross departmental 
income, which we will refer to as IDB. The first variable reflects production by residents in 
each department, while the second reflects the primary income received by those residents. 
The difference between these variables is the net external income of residents. 


CEGA also provided time series of gross household disposable income by department, 
which we will call IDBH. It is obtained from households' income after subtracting taxes on 
property and rental income and net payments to social security, and adding other net 
current transfers. This variable is a more accurate measure of a population's welfare than 
per-capita PDB, as it reflects household income after taxes and net public and private 
transfers.¹⁰ 


Due to the advantages of the CEGA database, and as this database provides the only 
two consistent time series covering a long time span, we present results and discussion on 
convergence for per capita PDB and IDBH.¹¹ Our final data set covers the period of 1975 
to 2000 for 23 departments, the capital district of Bogotá, and the nine “New Departments” 
grouped into one observation, for a total of 25 units and 25 years.¹² 


To calculate yearly per capita figures of both PDB and IDBH, we use the latest available 
population data, computed after reconciliation of the census of 2005 with previous censuses. 
According to the census of 2005, the population is smaller than had been forecast using 
the 1993 census, due to a lower birth rate and increased external migration (DANE, 2007). 
Although population was overestimated in most departments, in some particular cases it 
was underestimated. 

We use yearly population data at the departmental level from DANE (2007) for the years 
1985 to 2000. For the years 1975 to 1985, we interpolated departmental population using 
the annual growth rate from 1973 to 1985 based on the 1973 census. The values obtained 
show a consistent evolution of population by department once connected to the official 
estimates from 1985 onwards. 

10 The abbreviations used refer to the original names in Spanish: Producto Departamental Bruto (PDB), Ingreso 
Departamental Bruto (IDB) and Ingreso Departamental Bruto (disponible) de los Hogares (IDBH). 

11 As will be explained in the next section, it would be best to work with data expressed as per unit of effective 
worker, but due to data availability, researchers often use per capita figures. 

12 The New Departments have existed formally since the 1991 constitutional reform when nine former 
intendancies and commissariats were acknowledged as departments (Amazonas, Arauca, Casanare, Guainía, 
Guaviare, Putumayo, San Andrés y Providencia, Vaupés, and Vichada). 
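
The interpolation of departmental population for 1975 to 1985 can be sketched as follows. This is only an illustration under the assumption of a constant (geometric) annual growth rate between the 1973 census and the 1985 estimate; the function names and the population figures are hypothetical and are not taken from DANE.

```python
def annual_growth_rate(pop_start, pop_end, years):
    """Constant annual growth rate linking two population figures."""
    return (pop_end / pop_start) ** (1.0 / years) - 1.0


def backcast_population(pop_end, growth, start_year, end_year):
    """Population for each year from start_year to end_year, discounting
    the end-year value by the constant annual growth rate."""
    return {
        year: pop_end / (1.0 + growth) ** (end_year - year)
        for year in range(start_year, end_year + 1)
    }


# Hypothetical department: 1,000,000 inhabitants in 1973 and 1,300,000 in 1985.
growth = annual_growth_rate(1_000_000, 1_300_000, 1985 - 1973)
population = backcast_population(1_300_000, growth, 1975, 1985)
```

By construction the back-cast series ends exactly at the 1985 figure, which is what allows it to be connected to the official estimates from 1985 onwards.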

Box plots of per capita PDB and IDBH in logs are shown in Figures 4.2 and 4.3. Box plots 
of relative PDB and relative IDBH in logs are shown in Figures 4.4 and 4.5. By relative we 
mean that the variables are expressed as ratios to the national average of the corresponding 
year. We can see that the ordering of departments is similar in both types of graphs, in 
levels and in relative terms, particularly at the upper and lower ends. The five departments 
with the lowest per capita PDB are Chocó, Sucre, Córdoba, Nariño, and Cauca, four of these 
being located on the Pacific Coast. Bogotá, Valle, Antioquia, Nuevos Departamentos, and 
Cundinamarca have the five highest PDBs. Concerning per capita IDBH, the departments at the 
extremes are almost the same, except that Santander appears instead of Nuevos Departamentos. 

The box plots show large variability in the per capita figures of Guajira for both PDB 
and IDBH, and low variability for Bogotá. This pattern is accentuated in the figures 
relative to the average and, in the case of PDB, also holds for the group of Nuevos 
Departamentos. In contrast, the log of per capita IDBH shows less variation and dispersion 
of values, but a larger difference between the richest and the poorest departments. Note 
also that the large variability of the group of Nuevos Departamentos in PDB is not visible 
in IDBH. In the following two sections, we present two well-known approaches for testing 
for convergence: the classical approach to convergence analysis and the distributional 
approach. 


4.3 The Solow Model and Its Estimation 


4.3.1 The Solow Model 


Empirical testing of convergence across economies is based upon the neoclassical growth 
model developed by Solow (1956),¹³ in which economies have a transition dynamic towards 
the steady state, defined as a situation in which all variables per unit of effective worker 
remain unchanged over time. In the steady state, the ratio of capital to labor is constant given 
that the capital stock expands at the same rate as the labor force, and the capital expansion is 
sufficient to compensate for it. 

The neoclassical growth model assumes diminishing returns to factors and constant 
returns to scale. Due to this assumption, real returns of factors adjust to bring about 
full employment of labor and capital. Technology is exogenous and is the only force that 
explains changes in output and capital per worker. Any capital-to-labor ratio different 
from the one needed in the steady state readjusts as time passes so that economies tend 
towards the steady state. The speed at which this happens is known as the convergence rate 
and is inversely related to the distance from the steady state (Durlauf, 1996). 

13 The neoclassical model was also developed in the original works of Ramsey (1928) and Cass (1965). 

Under the framework of the neoclassical growth model, smaller initial values of the 
capital-to-labor ratio $k$ are associated with greater growth rates of production per 
worker. Robert Barro and Xavier Sala-i-Martin tested whether economies with lower capital 
per worker at a certain initial point in time grew more quickly in per-worker terms (Barro 
and Sala-i-Martin, 1991, 1992a,b, 2004; Sala-i-Martin, 1996), using the following equation: 


$$\log[\hat{y}(t)] = \left(1 - e^{-\beta^* t}\right)\log(\hat{y}^*) + e^{-\beta^* t}\log[\hat{y}(0)], \tag{4.1}$$


where $t$ represents time and $\beta^*$ indicates how rapidly an economy's output per 
effective worker $\hat{y}$ approaches its steady-state value $\hat{y}^*$ in the neighborhood 
of the steady state. The corresponding definition of $\beta^*$ with a constant saving rate 
$s$ is $\beta^* = (1 - \alpha)(x + n + \delta)$, where $\alpha$ is a constant representing 
the share of capital in production, $n$ is the rate of population growth, $x$ is the rate 
of exogenous growth, and $\delta$ is the depreciation rate. The speed of convergence is 
measured by how much the growth rate decreases as the capital stock increases in a 
proportional manner.¹⁴ Equation 4.1 implies that the average growth rate of per-capita 
output $Y$ over an interval from an initial time 0 to any future time $T$ (higher than 0) is 


$$\frac{1}{T}\log\left[\frac{Y(T)}{Y(0)}\right] = x + \frac{1 - e^{-\beta T}}{T}\log\left[\frac{\hat{y}^*}{\hat{y}(0)}\right], \tag{4.2}$$


where $x$ is the rate of technological progress or the steady-state growth rate.¹⁵ Equation 
4.2 also shows that the effect of the initial position $\hat{y}(0)$ is conditioned on the 
steady-state position $\hat{y}^*$ (conditional convergence) (Barro and Sala-i-Martin, 2004). 
The approach suggested by Barro and Sala-i-Martin (2004) is known as the regression 
approach or as the classical approach to convergence analysis (Sala-i-Martin, 1996; 
Magrini, 2004). There are two alternatives for applying this concept: testing for absolute 
convergence or for conditional convergence. 
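
To give a feel for the magnitudes involved, the short sketch below evaluates the coefficient on initial income in Equation 4.2 for different interval lengths, together with the half-life implied by Equation 4.1 (the time after which half of the initial log gap to the steady state has closed, $\ln 2/\beta$). The function names and the value of 2 percent used for $\beta$ are illustrative assumptions only.

```python
import math


def convergence_coefficient(beta, T):
    """Coefficient (1 - exp(-beta*T)) / T multiplying the log gap in Equation 4.2."""
    return (1.0 - math.exp(-beta * T)) / T


def half_life(beta):
    """Years needed to close half of the log gap to the steady state,
    since the gap shrinks as exp(-beta * t) in Equation 4.1."""
    return math.log(2.0) / beta


beta = 0.02  # illustrative speed of convergence (2 percent per year)
coefficients = {T: convergence_coefficient(beta, T) for T in (1, 10, 25, 50)}
years_to_halve_gap = half_life(beta)
```

For a given positive $\beta$ the coefficient falls as $T$ grows, so regressions over longer intervals show a weaker link between initial income and average growth even when the underlying speed of convergence is unchanged.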


4.3.2 Absolute Beta-Convergence 


The concept of absolute beta-convergence (also known as unconditional convergence) is 
relevant for a group of closed economies that are structurally similar; they have the same 
values of the parameters $x$, $s$, $n$, and $\delta$ and the same production function, and 
thus the same steady-state values $k^*$ and $\hat{y}^*$. The only difference is the initial 
quantity of capital per person $k(0)$, which reflects past disturbances (wars, transitory 
shocks to production, etc.). Hence, economies with lower values of $k(0)$ and $Y(0)$ have 
unambiguously greater growth rates of $k$ and $Y$. The estimation equation for absolute 
convergence is equation 4.2, omitting the $\hat{y}^*$ term: 


$$\frac{1}{T}\log\left[\frac{Y_{i,t}}{Y_{i,t-T}}\right] = a - \frac{1 - e^{-\beta T}}{T}\log[Y_{i,t-T}] + u_{i,T}, \tag{4.3}$$


where $u_{i,T}$ represents the effect of the error terms $u_{i,t}$ between dates $t - T$ 
and $t$, $i$ is the corresponding subscript for each region or country, and 
$a = x + \frac{1 - e^{-\beta T}}{T}\log(\hat{y}^*)$. Absolute convergence arises when the 
term multiplying the initial income is negative, and implies that poor economies tend to 
grow more quickly than wealthy ones. One can estimate the regression with non-linear least 
squares (NLLS) to obtain the speed of convergence $\beta$ directly. 

14 Note that $\beta^*$ is not the same as $\beta$. It is the convergence rate in the proximity of the steady 
state and is determined by $(1 - \alpha)$ for given values of $x$, $n$, and $\delta$. 

15 Equation 4.2 indicates that the coefficient $(1 - e^{-\beta T})/T$ declines the higher $T$ is, for a given 
$\beta$ and as long as $\beta$ is positive. Therefore, the influence of the initial position on the average 
growth rate of $Y$ decreases as $T \to \infty$, and $x$ dominates the average growth rate. In contrast, for a 
given $T$, a higher $\beta$ implies a higher coefficient $(1 - e^{-\beta T})/T$. 
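
A minimal sketch of the NLLS estimation of Equation 4.3 is given below. It uses `scipy.optimize.curve_fit` on synthetic data for 25 hypothetical units; the parameter values, the noise level, and the variable names are assumptions made for illustration and do not reproduce the CEGA data or the estimates reported later. The standard errors returned here are the conventional ones, not heteroscedasticity-robust ones.

```python
import numpy as np
from scipy.optimize import curve_fit

T = 25  # length of the growth interval in years (for example 1975 to 2000)


def average_growth(log_y0, a, beta):
    """Right-hand side of Equation 4.3 without the error term."""
    return a - (1.0 - np.exp(-beta * T)) / T * log_y0


# Synthetic cross-section of 25 hypothetical units (not actual data).
rng = np.random.default_rng(0)
true_a, true_beta = 0.02, 0.012
log_y0 = rng.normal(loc=0.0, scale=0.4, size=25)
growth = average_growth(log_y0, true_a, true_beta) + rng.normal(scale=0.0005, size=25)

# NLLS estimates beta directly instead of recovering it from a linear slope.
(a_hat, beta_hat), cov = curve_fit(average_growth, log_y0, growth, p0=[0.0, 0.01])
std_errors = np.sqrt(np.diag(cov))  # conventional (non-robust) standard errors
```

A positive `beta_hat` corresponds to a negative coefficient on initial income, that is, to absolute convergence in the sense described above.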


4.3.3 Conditional Convergence 


Conditional beta-convergence arises by allowing for heterogeneity across economies, 
particularly by dropping the assumption that all economies have the same parameters and the 
same steady state.¹⁶ The main idea is that the further an economy is from its own 
steady-state value, the more quickly it grows: 


$$\frac{1}{T}\log\left[\frac{Y_{i,t}}{Y_{i,t-T}}\right] = a - \frac{1 - e^{-\beta T}}{T}\log[Y_{i,t-T}] + \gamma X_i + u_{i,T}, \tag{4.4}$$


where $X_i$ is a set of variables that proxy for the steady-state level of income 
($\hat{y}_i^*$). Empirical studies show little evidence of unconditional convergence for 
large and heterogeneous samples of countries. Instead, they tend to find conditional 
convergence in economies with similar structural characteristics (Barro and Sala-i-Martin, 
1991) with speeds of convergence usually around 2 percent. However, there is no agreement 
on which variables to include as proxies for the steady state, and their selection depends 
mostly upon the researcher's interest. An extensive review by Durlauf et al. (2005) lists 
about 145 different regressors used in the convergence literature and points out that most 
of them have been found to be statistically significant. Durlauf et al. (2005) classify 
these regressors into 43 distinct growth theories or growth determinants, raising doubts 
about their usefulness. 


16 Under the assumption of different parameters, Equation 4.3 would provide biased estimates because the 
steady-state level of income $\hat{y}_i^*$ would be correlated with the explanatory variable 
$\log[Y_{i,t-T}]$. To solve this problem, Barro and Sala-i-Martin (1992a) suggest incorporating into the 
regression a set of variables $X_i$ as proxies for the steady-state level of income ($\hat{y}_i^*$) and 
testing for conditional convergence. 
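
If Equation 4.4 is estimated by ordinary least squares rather than NLLS, the slope on initial income has to be converted back into a speed of convergence. The worked step below sketches that conversion; the symbol $b$ for the linear slope and the numerical values are illustrative assumptions, not results from this study.

```latex
b = -\frac{1 - e^{-\beta T}}{T}
\quad\Longrightarrow\quad
\beta = -\frac{1}{T}\ln(1 + bT)
% Illustrative (hypothetical) values: b = -0.0104, T = 25
% 1 + bT = 0.74, \ln(0.74) \approx -0.3011, so \beta \approx 0.3011 / 25 \approx 0.012
```

The conversion is only defined when $1 + bT > 0$, which is one practical reason for estimating $\beta$ directly with NLLS.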




4.3.4 Parameter Heterogeneity: Are There Different Steady States? 


An alternative way to estimate conditional beta-convergence is to remove the assumption of 
parameter homogeneity, as suggested by Canova and Marcet (1995) and Maddala and Wu 
(2000), using time-series cross-sectional (TSCS) data.¹⁷ Advocates of this approach argue 
that Barro-type growth regressions create biases in the estimated coefficients by pooling 
data whenever there is heterogeneity in the parameters. Moreover, cross-sectional 
regressions waste information, since they ignore unit-specific time variations in growth 
rates and prevent the estimation of a steady state for each region or country separately 
(e.g. Lee et al., 1997; Temple, 1999; Pritchett, 2000; Durlauf, 2001; Brock and Durlauf, 
2001; Masanjala and Papageorgiou, 2004).¹⁸ 

Canova and Marcet (1995) propose a way to model heterogeneity and calculate steady 
states for each unit without proxying for the steady state of income with additional 
variables. The model allows calculation of the speed of adjustment of each unit to its own 
steady state. A weakness of the approach is the need for the time dimension $T$ to be 
large; otherwise, estimates will have large standard errors and their small-sample 
distribution may deviate strongly from the asymptotic one. Using cross-country data, they 
find an average speed of adjustment close to 11 percent, but reject the hypothesis of 
equal steady states for all cross-sectional units.¹⁹ Using an iterative Bayesian approach 
with a similar cross-country data set, Maddala and Wu (2000) find average annual 
convergence rates of around 5 percent and further argue in favor of different steady states 
for each country. 

The estimation relies upon transforming equation 4.2 into discrete time as follows: 


$$\log(y_{i,T}) = a + \rho_T \log(y_{i,0}) + \gamma X_i + u_{i,T}, \tag{4.5}$$


where $y_{i,t}$ is relative output per worker, which will be defined below, 
$\rho_T = e^{-\beta T}$, periods are indexed by $t = 0, 1, 2, \ldots, T$, and the variables 
$X_i$ are introduced to allow for shifts in the limit of the steady-state means of 
$y_{i,t}$. The key to allowing for parameter heterogeneity lies in dropping the assumptions 
that $\beta_i = \beta$ and $a_i = a$ $\forall i$. Dropping the first assumption is 
expressed by $\rho_i \neq \rho_j$; that is to say, the convergence rates of the economies 
are allowed to differ. After grouping $\alpha_i = a + \gamma X_i$, the final estimation is 


$$\log(y_{i,t}) = \alpha_i + \rho_i \log(y_{i,t-1}) + u_{i,t}. \tag{4.6}$$


17 For a description of time-series cross-sectional data, see, for example, Beck (2001) and Beck and Katz (2007). 

18 As indicated by Masanjala and Papageorgiou (2004), parameter heterogeneity in growth regressions has at 
least three interpretations: there are (i) multiple steady states, i.e., the parameters of a linear growth 
regression are not constant across countries (e.g. Durlauf, 1996), (ii) omitted growth determinants (e.g. 
Durlauf and Quah, 1999), and (iii) nonlinearities of the production function, i.e., the identical Cobb-Douglas 
aggregate production function may be unsuitable. After investigating the third interpretation, Masanjala and 
Papageorgiou (2004) conclude that using more general constant elasticity of substitution aggregate production 
functions does not explain away heterogeneity across countries, and they consequently suggest shifting 
attention to the other two interpretations. 

19 According to Shioji (1997), their convergence rates are high due to the type of Bayesian approach and the 
short period used (10 years). 


Note that both Canova and Marcet (1995) and Maddala and Wu (2000) use relative per 
worker (capita) output $y_{i,t}$ for the estimation, defined as 
$y_{i,t} = Y_{i,t}/\bar{Y}_t$, i.e., per capita output of region $i$ in period $t$, divided 
by the national average of output per capita in year $t$. A value higher (or lower) than 1 
means that the region has a higher (or lower) per-capita output than the national average. 
Using $y_{i,t}$ instead of $Y_{i,t}$ has the advantage that the linear trend term 
disappears, as it is assumed that in steady state all regions should grow at the same rate 
of technological progress, although the levels may vary. It also corrects for problems of 
serial and residual cross-unit correlation and avoids specifying a process for growth, that 
is, whether it is a trend or a unit root with drift (Maddala and Wu, 2000). 

For each region, Equation 4.6 is an AR(1) process of $\log(y_{i,t})$. If $|\rho_i| < 1$, 
the time series is stationary and, given that $E(\log(y_{i,t})) = E(\log(y_{i,t-1}))$, one 
could estimate the expected value as 


$$\hat{E}(\log(y_{i,t})) = \frac{\hat{\alpha}_i}{1 - \hat{\rho}_i}, \tag{4.7}$$


where $\hat{\alpha}_i$ and $\hat{\rho}_i$ are obtained from regressions based on Equation 4.6. 

According to Maddala and Wu (2000), the condition $|\rho_i| < 1$ ensures that region $i$ 
converges towards its own steady state and is equivalent to the definition of 
beta-convergence in Barro and Sala-i-Martin (1992a). As long as $|\rho_i| < 1$, the speed 
of adjustment of each unit to its own steady state is given by $1 - \rho_i$. 
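
The unit-by-unit version of this procedure can be sketched as follows: an ordinary least squares fit of Equation 4.6 for a single unit, followed by the implied steady state from Equation 4.7 and the speed of adjustment. The function name and the noiseless synthetic series are assumptions made purely for illustration; with real data the estimates would carry sampling error, which is large when the time dimension is short.

```python
import numpy as np


def ar1_steady_state(log_y):
    """Fit log_y[t] = alpha + rho * log_y[t-1] by OLS for one unit and return
    (alpha_hat, rho_hat, steady_state, speed_of_adjustment)."""
    y = np.asarray(log_y, dtype=float)
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])  # constant and lagged value
    alpha_hat, rho_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]
    # Equation 4.7 is only meaningful for a stationary process (|rho| < 1).
    steady = alpha_hat / (1.0 - rho_hat) if abs(rho_hat) < 1 else float("nan")
    return alpha_hat, rho_hat, steady, 1.0 - rho_hat


# Hypothetical unit: log relative income starts at -0.5 and follows Equation 4.6
# exactly (no shocks) with alpha = -0.02 and rho = 0.9 over 26 yearly observations.
series = [-0.5]
for _ in range(25):
    series.append(-0.02 + 0.9 * series[-1])

alpha_hat, rho_hat, steady_state, speed = ar1_steady_state(series)
```

In this example the implied steady state is $-0.02/(1 - 0.9) = -0.2$ in logs, that is, a unit that settles below the national average, and the speed of adjustment is 0.1.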

Concerning the empirical estimation, and as discussed by Maddala and Wu (2000), equation 
4.6 can be estimated by (i) pooling the data and assuming that $\alpha_i = \alpha$ and 
$\rho_i = \rho$ $\forall i$, (ii) running 25 separate regressions, one for each department, 
allowing for 25 values of $\alpha_i$ and $\rho_i$, or (iii) through shrinkage estimators 
that assume that $\alpha_i$ and $\rho_i$ have two components, one fixed and one random. 
Additionally, one could estimate Equation 4.6 assuming that there is a fixed number of 
groups, allowing, for example, for three values of $\alpha$ and $\rho$, in other words, 
$\alpha_1$, $\alpha_2$, $\alpha_3$ and $\rho_1$, $\rho_2$, $\rho_3$. The departments that 
belong to each group should be identified with an appropriate method. 

We will estimate equation 4.6 following all the alternatives presented. 


4.3.5 Sigma-Convergence 


An alternative to evaluating beta-convergence is to focus on whether there is a reduction over 
time in the dispersion of real per-capita income across entities, indicating a more equitable 
distribution of income. This is called sigma-convergence and arises when, for $T > 0$, 


$$\sigma_{t+T} < \sigma_t, \tag{4.8}$$


where $\sigma_t$ is the standard deviation of real per-capita income in period $t$ (Sala-i-Martin, 1996). 
The existence of beta-convergence tends to generate sigma-convergence. However, there 
are cases in which shocks affecting each entity differently lead to the existence of beta- 
convergence but the lack of sigma-convergence. The example given by Sala-i-Martin (1996) 
in this regard is clear. Assume two economies, one rich and one poor. The initially poor 
economy grows so quickly that in the final period its distance from the rich one is the same 
as before, except that now the initially poor economy is the wealthier one. In such a case, the resulting 
standard deviation would be the same in the initial and final period. One would observe 
beta-convergence, given that the poor economy is growing more quickly than the rich one, 
but no sigma-convergence. Hence, sigma-convergence is an indicator of dispersion of the 
overall entities, but does not tell much about mobility of each one. Beta-convergence is thus 
a necessary, but not sufficient, condition for observing sigma-convergence. 
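
The two-economy example can be checked numerically. The sketch below computes the cross-sectional standard deviation of log income per period and the growth of each economy; the income values are hypothetical numbers chosen so that the two economies swap positions while keeping the same distance.

```python
import numpy as np


def sigma(log_income):
    """Cross-sectional standard deviation of log income for each period.
    Rows are periods, columns are economies."""
    return np.std(np.asarray(log_income, dtype=float), axis=1, ddof=1)


# Rows: initial and final period. Columns: economy A (initially rich) and
# economy B (initially poor). Values are hypothetical log incomes.
log_income = np.array([[2.0, 1.0],
                       [2.2, 3.2]])

dispersion = sigma(log_income)            # one value per period
growth = log_income[1] - log_income[0]    # log growth of A and B
```

Here the initially poor economy B grows much faster than A, which is what beta-convergence requires, yet the dispersion is identical in both periods, so Equation 4.8 does not hold.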


4.4 Distributional Approach: Quah's Critique 


One important critique of the standard regression approach was raised by Danny Quah (Quah, 
1993a,b, 1996, 1997), who argues that neither beta- nor sigma-convergence can deliver useful 
answers to the question of whether poor countries or regions are catching up to wealthier 
ones. Quah argues that the classical approach does not give any information about mobility, 
stratification, or polarization, and suggests that the typically obtained 2-percent speed of 
convergence is a statistical artifact that arises in moderate-size samples for reasons other 
than convergence (Durlauf et al., 2005). In his analysis using cross-country data, Quah finds 
some evidence of convergence clubs, but also evidence of poor countries becoming 
progressively poorer and wealthy countries becoming even wealthier. 

Quah initially suggested working with a sequence of income distributions and, after 
discretizing the space of income values, counting the observed transitions into and out of 
the distinct cell values to construct a transition probability matrix (Quah, 1993a,b). 
Later, Quah (1997) argued that the discretization could distort dynamics if the underlying 
observations are indeed continuous variables. He proposed thinking of the number of distinct 
cells as tending towards infinity and towards the continuum, with the transition probability 
matrix tending to a matrix with a continuum of rows and columns, that is, becoming a 
stochastic kernel.²⁰ 
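
The discretized version of this idea can be sketched in a few lines: relative incomes are assigned to cells, transitions between the cells are counted, and each row is normalized to sum to one. The cut-off values, the function name, and the six observations are hypothetical; applications typically pool transitions over many years rather than using a single pair of periods.

```python
import numpy as np


def transition_matrix(start, end, cutoffs):
    """Row-normalized matrix of transitions between income cells.
    `cutoffs` are the interior cell boundaries (k cutoffs give k + 1 cells)."""
    s = np.digitize(start, cutoffs)
    e = np.digitize(end, cutoffs)
    k = len(cutoffs) + 1
    counts = np.zeros((k, k))
    for i, j in zip(s, e):
        counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observations are left as zeros instead of dividing by zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


# Hypothetical log relative incomes of six units in an initial and a final year.
start = np.array([-0.60, -0.50, -0.10, 0.00, 0.30, 0.40])
end = np.array([-0.55, -0.10, -0.05, 0.05, 0.35, 0.10])
P = transition_matrix(start, end, cutoffs=[-0.25, 0.25])
```

Large diagonal entries of `P` indicate persistence, while mass off the diagonal indicates mobility between cells. The result depends on where the cut-offs are placed, which is the distortion that motivates the move to a stochastic kernel.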

The methodology is based upon tracking the evolution over time of the entire cross- 
sectional distributions across regions through the estimation of kernel densities for "relative" 
variables, which means that the variables of interest are expressed as being relative to the 
national average, allowing abstraction from changes in the mean when one evaluates how 
the distribution changes. 

Before we define how we proceed to test for convergence using the distributional 
approach, we briefly present some concepts needed for our estimation.²¹ 

For the distributional approach, all variables are expressed relative to the Colombian 
value. Additionally, we take the logarithm of the relative variable, as it facilitates the com- 
parison to the national level. Expressed in logs, a relative value equal to 0 indicates that the 
department has the same value as the country, while a value that is, for example, equal to 
-0.05 means that the value of the department is approximately 5 percent lower than the national value. 

A univariate kernel density estimate may be regarded as a generalization of a histogram: 


$$\hat{f}_h(q) = \frac{1}{mh}\sum_{i=1}^{m} \kappa\left(\frac{q - Q_i}{h}\right), \tag{4.9}$$


where $\kappa$ is a kernel, $m$ is the number of observations, and $h > 0$ is the bandwidth, 
also called the smoothing parameter.²² In the context of growth convergence, we are 
interested in checking whether we find unimodality or multimodality in the estimated 
densities of the logarithm of relative income, and in what way the estimated densities 
change between the starting and the final period. 
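
Equation 4.9 translates almost directly into code. The sketch below uses a Gaussian kernel, which is one common choice and is assumed here only for illustration; the data values and the bandwidth of 0.15 are hypothetical and no bandwidth selection rule is implied.

```python
import numpy as np


def kernel_density(q, data, h):
    """Univariate kernel density estimate of Equation 4.9 with a Gaussian kernel.
    `q` holds the evaluation points, `data` the observations, `h` the bandwidth."""
    q = np.atleast_1d(np.asarray(q, dtype=float))
    data = np.asarray(data, dtype=float)
    u = (q[:, None] - data[None, :]) / h              # (q - Q_i) / h for every pair
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (len(data) * h)


# Hypothetical log relative incomes of ten units.
data = np.array([-0.6, -0.5, -0.45, -0.2, -0.1, 0.0, 0.05, 0.1, 0.4, 0.5])
grid = np.linspace(-2.0, 2.0, 4001)
density = kernel_density(grid, data, h=0.15)

# Trapezoidal check that the estimated density integrates to about one.
area = float(((density[:-1] + density[1:]) / 2.0 * np.diff(grid)).sum())
```

Plotting `density` against `grid` for the initial and the final year would show whether the distribution has one or several modes; a smaller `h` reveals more modes, so conclusions about multimodality should be checked for several bandwidths.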

Bivariate kernel density estimation requires two-dimensional data and a two-dimensional 
kernel. Here, $Q = (Q_1, Q_2)^T$ and the kernel $K$ maps $\mathbb{R}^2$ into $\mathbb{R}_+$. 
The estimate is 


$$\hat{f}_H(q) = \frac{1}{m}\sum_{i=1}^{m} \frac{1}{\det(H)} K\left(H^{-1}(q - Q_i)\right), \tag{4.10}$$


where $K$ is a bivariate kernel function, $m$ is the number of observations, and $H$ is a 
symmetric bandwidth matrix. 
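
A direct implementation of the bivariate estimator is sketched below, again with a Gaussian kernel chosen only for illustration. Each observation is a pair of log relative incomes (initial year, final year); the data and the diagonal bandwidth matrix are hypothetical.

```python
import numpy as np


def bivariate_kernel_density(q, data, H):
    """Bivariate kernel density estimate of Equation 4.10 with a standard
    Gaussian kernel. `q` has shape (g, 2), `data` shape (m, 2), `H` shape (2, 2)."""
    q = np.atleast_2d(np.asarray(q, dtype=float))
    data = np.asarray(data, dtype=float)
    H = np.asarray(H, dtype=float)
    diffs = q[:, None, :] - data[None, :, :]          # q - Q_i, shape (g, m, 2)
    u = diffs @ np.linalg.inv(H).T                    # H^{-1} (q - Q_i) for every pair
    kernel = np.exp(-0.5 * (u**2).sum(axis=2)) / (2.0 * np.pi)
    return kernel.sum(axis=1) / (len(data) * abs(np.linalg.det(H)))


# Hypothetical units: (log relative income in the first year, in the last year).
data = np.array([[-0.6, -0.55], [-0.5, -0.1], [-0.1, -0.05],
                 [0.0, 0.05], [0.3, 0.35], [0.4, 0.1]])
H = np.diag([0.2, 0.2])

on_diagonal = bivariate_kernel_density([[0.0, 0.0]], data, H)[0]
off_diagonal = bivariate_kernel_density([[0.5, -0.5]], data, H)[0]
```

Evaluating the function on a grid and drawing contours gives the kind of plot described in the text: mass concentrated along the 45-degree line points to persistence, while mass away from it points to units changing their relative position.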

For the analysis of convergence, we estimate the bivariate kernel density for the relative 
variable in two periods and check whether or not a large portion of the probability mass 
remains clustered around the 45-degree diagonal, which would indicate persistence in the 
distribution. We present the 3D representation of the estimated bivariate density and a 
contour plot showing the highest density regions. 

20 For a technical derivation of a stochastic kernel, see Quah (1997, section 4). 

21 A review of the statistical principles of univariate and multivariate kernel density estimation can be found, 
for example, in Härdle et al. (2004). 

22 Kernel refers to any smooth function satisfying the conditions $\kappa(q) \geq 0$, $\int \kappa(q)\,dq = 1$, 
$\int q\,\kappa(q)\,dq = 0$, and $\sigma_\kappa^2 = \int q^2 \kappa(q)\,dq > 0$ (Wasserman, 2006). 


4.5 Empirical Estimation and Results 


We empirically test for convergence in PDB and IDBH using both the classical and 
distributional approaches to convergence, as we are interested in checking whether, in the 
Colombian case, the two approaches yield contradictory results, as suggested by the 
existing literature on Colombia. We do not use population weights in our calculations, as 
we are interested in investigating whether or not departments that were lagging behind have 
been able to catch up. We consider this to be a pertinent question in the Colombian case, 
where departments are important political entities with elected local governments and 
separate department assemblies. 

Our empirical analysis begins with the classical approach, testing for sigma and beta- 
convergence. In the case of beta-convergence, we test absolute and conditional convergence. 
Conditional convergence is tested with cross-sectional regressions with control variables and 
also with AR(1) regressions using time-series cross-sectional data for relative income, start- 
ing with a pooled model that assumes homogeneity in the parameters and then allows for 
heterogeneity. 

We then follow the distributional approach and compute univariate and bivariate kernel 
density estimators for relative income in 1975 and 2000. 


4.5.1 Sigma-Convergence 


Results for sigma-convergence are presented in Figure 4.7. As may be observed, there is 
evidence of sigma-convergence in IDBH but not in PDB. From 1975 to 1984, the standard 
deviation of the log of both variables remains close to 0.40. From 1985 onwards, the 
standard deviation for IDBH decreases, reaching a value close to 0.32 in 2000. In contrast, 
that for PDB remains around 0.40. Thus, the distribution of IDBH has become more equitable, 
while the distribution of PDB has not. 


4.5.2 Absolute Beta-Convergence 


Figure 4.8 shows a weak inverse relationship between the growth rate of per-capita PDB 
between 1975 and 2000 and its value in 1975. Cross-sectional regression results based upon 
Equation 4.3 and using NLLS are shown in Table 4.2. We use HC3 robust standard errors 
as proposed by Davidson and MacKinnon (1993) to account for possible heteroscedasticity,
considering that the number of observations is small (Long and Ervin, 2000). The estimated
speed of convergence is 0.7 percent, but it is not significantly different from 0 at the 5 percent 
level. The adjusted R-squared of the regression is extremely low (0.01) suggesting that this 
model does not explain departmental PDB growth rates. These results do not change if one 
excludes Chocó, Nuevos, and Guajira, which have a large influence on results, as suggested 
by Cook's distance computed after the first regression (Figure 4.9). 
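The link between the regression slope and the speed of convergence can be sketched as follows. Equation 4.3 is not reproduced in this section, so the snippet assumes the standard specification in which the average growth rate over T years equals a constant minus (1 - exp(-beta T))/T times the log of initial income. Under that assumption the NLLS point estimate of beta is a transformation of the OLS slope. All names and data below are hypothetical.

```python
import math

def ols(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def absolute_beta(initial, final, years):
    """Speed of convergence implied by initial and final per-capita incomes."""
    log_initial = [math.log(v) for v in initial]
    growth = [math.log(f / i) / years for i, f in zip(initial, final)]
    _, slope = ols(log_initial, growth)
    # Assumed model: slope = -(1 - exp(-beta * years)) / years, solved for beta.
    return -math.log(1.0 + slope * years) / years

# Hypothetical departments generated with beta = 0.012 over 25 years.
T, a, beta_true = 25, 0.15, 0.012
y0 = [50.0, 100.0, 200.0, 400.0]
yT = [
    v * math.exp(T * (a - (1 - math.exp(-beta_true * T)) / T * math.log(v)))
    for v in y0
]
beta_hat = absolute_beta(y0, yT, T)
print(round(beta_hat, 4))
```

The sketch only recovers the point estimate; the HC3 standard errors used in the text would require an additional step that is not shown here.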


In the case of IDBH, Figure 4.10 shows a stronger negative relationship between the 
growth rate of per-capita IDBH between 1975 and 2000 and its value in 1975. This is con- 
firmed with the regression presented in Table 4.3, where the estimated speed of convergence 
is 1.2 percent and statistically significant. The adjusted R-squared is 0.35. Excluding Gua- 
jira, as suggested by Cook's distance, and then rerunning the regression yields similar results. 


Hence, we find evidence of absolute beta-convergence using IDBH, but not using PDB. 


4.5.3 Conditional Beta-Convergence Using Control Variables 


As explained in Subsection 4.3.3, one may drop the assumption that all economies have the 
same parameters, and hence the same steady state, and try to proxy for the steady-state level
of income with a set of variables $X_i$, running regressions based upon Equation 4.4.


There is no agreement as to which variables to include as proxies for the steady state with 
cross-sectional data (Durlauf et al., 2005). We use variables that are based upon theoretical 
arguments and our choice is limited by data availability at the departmental level. We use the 
logarithm of population growth and a variable based upon saving rates.23 Additionally, we
use three variables proxying for human capital: log of life expectancy in 1975, log of literacy
in 1973, and log of net enrolment rate in 1985. Several specifications for the average growth
rate of per-capita PDB are shown in Table 4.4 and for per-capita IDBH in Table 4.5.24


Results for PDB show that the speed of convergence remains statistically insignificant in 
all the specifications, including the variables proxying for the steady state, as was the case 
with absolute convergence. We find no evidence of conditional convergence using PDB data. 


In the case of IDBH, where we find evidence of absolute convergence, once we include 
variables $X_i$ proxying for the steady-state level of income, the speed of convergence becomes
insignificant. We find no evidence of conditional convergence using IDBH data.


23 As the saving rates that are available from CEGA (2006b,a) include values that are negative, we add a constant
to all values, so that the transformed data are all positive and we can compute the logs.
24 The number of departments included depends upon data availability.


4.5.4 Beta-Convergence Using Time-Series Cross-Sectional Data


Recall that with TSCS data, the regression is based upon Equation 4.6, defined in subsection 
4.3.4 as 


$$\log(y_{i,t}) = \alpha_i + \rho_i \log(y_{i,t-1}) + u_{i,t},$$


which uses the measure of relative income $y_{i,t}$, that is, income of each department expressed
as the ratio to the national average. One may estimate the equation in several ways. First, we
begin by pooling the data, assuming homogeneity in the parameters. Second, we use linear
mixed models where the parameters are assumed to have a fixed component, common to all
departments, and a random part. Third, we estimate 25 separate ordinary least squares (OLS)
regressions, one for each entity. Finally, we assume that there are several groups of departments
which share the same $\alpha$ and $\rho$, and explore this issue with finite mixture models.

In all cases, the key issue is whether the estimated value for $\rho$ is lower than 1, which
would suggest that there is economic convergence.


Pooled Data and OLS 


The assumption of $\alpha_i = \alpha$ and $\rho_i = \rho \;\forall i$ in Equation 4.6 is equivalent to assuming that
there is a common steady state to all departments. Hence, the results are comparable to 
those obtained using cross-sectional data when we tested for absolute beta-convergence in 
subsection 4.5.2. 

Tables 4.6 and 4.7 present the results for PDB and IDBH using TSCS pooled data and 
estimating with OLS. In both cases, the estimated $\rho$ is less than 1 (0.989 for PDB and 0.986
for IDBH). However, it must be noted that while the value 1 is not included in the 95 percent
confidence interval of $\rho$ for IDBH, it is included for PDB, confirming the evidence of
absolute convergence in IDBH, but not in PDB.
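A minimal sketch of the pooled estimation is given below: every department contributes its (lagged, current) pairs of log relative income to a single OLS regression with a common intercept and slope. The function names and the noise-free toy panel are hypothetical and serve only to show the stacking of the data.

```python
def pooled_ar1(series_by_department):
    """Pooled OLS of z_t on z_{t-1}; returns (alpha, rho)."""
    lagged, current = [], []
    for series in series_by_department.values():
        lagged.extend(series[:-1])   # z_{i,t-1}
        current.extend(series[1:])   # z_{i,t}
    n = len(lagged)
    mean_l, mean_c = sum(lagged) / n, sum(current) / n
    sxx = sum((l - mean_l) ** 2 for l in lagged)
    sxy = sum((l - mean_l) * (c - mean_c) for l, c in zip(lagged, current))
    rho = sxy / sxx
    return mean_c - rho * mean_l, rho

def simulate(start, alpha, rho, periods):
    """Deterministic AR(1) path of log relative income (no error term)."""
    path = [start]
    for _ in range(periods):
        path.append(alpha + rho * path[-1])
    return path

# Three hypothetical departments sharing alpha = -0.002 and rho = 0.98.
panel = {
    name: simulate(z0, -0.002, 0.98, 5)
    for name, z0 in [("A", -0.5), ("B", 0.1), ("C", 0.4)]
}
alpha_hat, rho_hat = pooled_ar1(panel)
print(round(alpha_hat, 4), round(rho_hat, 4))
```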

For IDBH the implied estimated speed of convergence $\beta$, computed with the estimated $\rho$
value, is 1.4 percent, only slightly higher than the one obtained using cross-sectional
data (1.2 percent) in Subsection 4.5.2.
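The conversion from the autoregressive coefficient to a speed of convergence can be illustrated with a short calculation. The snippet assumes the usual mapping for annual data, rho = exp(-beta), so that beta = -ln(rho); this mapping is an assumption here because the derivation in Subsection 4.3.4 is not reproduced in this section. For values of rho close to 1, 1 - rho gives nearly the same number.

```python
import math

def implied_speed(rho, years_per_step=1.0):
    """Speed of convergence implied by an AR(1) coefficient, assuming rho = exp(-beta * step)."""
    if not 0.0 < rho < 1.0:
        raise ValueError("the mapping is only defined here for 0 < rho < 1")
    return -math.log(rho) / years_per_step

# Pooled OLS estimates reported in the text.
for label, rho in [("PDB", 0.989), ("IDBH", 0.986)]:
    print(label, f"{100 * implied_speed(rho):.2f} percent")
```

With the rounded IDBH estimate of 0.986 this gives roughly 1.4 percent, in line with the figure quoted above.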


Mixed Models 


We follow here a frequentist approach for the estimation of Equation 4.6. Following Maddala 
et al. (1997) and using matrix notation, we define 

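$$
Z_i = \begin{pmatrix} \log(y_{i,1}) \\ \vdots \\ \log(y_{i,T}) \end{pmatrix}, \quad
X_i = \begin{pmatrix} 1 & \log(y_{i,0}) \\ \vdots & \vdots \\ 1 & \log(y_{i,T-1}) \end{pmatrix}, \quad
b_i = \begin{pmatrix} \alpha_i \\ \rho_i \end{pmatrix}, \quad
U_i = \begin{pmatrix} u_{i,1} \\ \vdots \\ u_{i,T} \end{pmatrix},
$$

with $i = 1, \ldots, N$, where $N$ is the number of regions in the data.

We consider the autoregressive regression model

$$Z_i = X_i b_i + U_i, \qquad i = 1, \ldots, N, \tag{4.11}$$

with the assumptions $U_i \sim N(0, \sigma^2 I)$ and $b_i \sim N(\mu, \Sigma)$, where $I$ is the identity matrix and $\Sigma$
is a nonzero covariance matrix.25 We further assume that the $U_i$ are independent across the
$N$ equations, and that $b_i$ and $U_i$ are independent for different regions.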

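We work with a linear mixed model (McCulloch and Searle, 2001). If we write $b_i$ as
$b_i = \mu + \eta_i$, with $\eta_i \sim N(0, \Sigma)$, we can rewrite $Z_i$ ($i = 1, \ldots, N$) as

$$
\begin{aligned}
Z_i &= X_i(\mu + \eta_i) + U_i \\
    &= X_i\mu + X_i\eta_i + U_i && (4.12) \\
    &= X_i\mu + w_i, && (4.13)
\end{aligned}
$$

with $w_i \sim N(0, \Omega_i)$, $\Omega_i$ being the variance-covariance matrix defined as

$$\Omega_i = X_i \Sigma X_i' + \sigma^2 I. \tag{4.14}$$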


In Equation 4.12, the vector $\mu$ represents the fixed effects and the $\eta_i$ represent the random
effects. In linear mixed models, fixed effects are used for modeling the mean of the response
variable and the random effects are used to model its variance-covariance structure
(McCulloch and Searle, 2001). The parameters in our linear mixed model are then $\mu$, $\Sigma$, and
$\sigma^2$. The last two parameters are in fact variance components, as presented in Equation 4.14.

One can obtain an estimator for $\mu$ and best linear unbiased predictors for the random
effects $\eta_i$ with maximum likelihood or restricted maximum likelihood (REML).26 Here, we
prefer REML for three reasons: (i) the estimators are based upon taking into account the
degrees of freedom for the fixed effects in the model, (ii) because of its unbiasedness in the
case of balanced panels, and (iii) as REML estimators seem to be less sensitive to outliers in
the data.27 With the obtained values for $\mu$ and $\eta_i$, one could compute predicted values for
the $N$ different $\alpha_i$ and $\rho_i$.
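To make the covariance structure of Equation 4.14 concrete, the following sketch builds the matrix for one department as the random-effects covariance sandwiched between the design matrix and its transpose, plus a residual variance on the diagonal. The reading of the equation used here, the helper names, and all numbers are assumptions for illustration; they are not estimates from the text.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def transpose(a):
    return [list(row) for row in zip(*a)]

def omega(x, sigma_mat, sigma2):
    """Covariance of w_i: X Sigma X' plus sigma2 times the identity."""
    core = matmul(matmul(x, sigma_mat), transpose(x))
    size = len(core)
    return [
        [core[i][j] + (sigma2 if i == j else 0.0) for j in range(size)]
        for i in range(size)
    ]

# Hypothetical X_i for T = 3: a column of ones and lagged log relative income.
x_i = [[1.0, -0.30], [1.0, -0.28], [1.0, -0.25]]
# Hypothetical covariance of the random effects (intercept, slope).
sigma_mat = [[0.0004, 0.0], [0.0, 0.0001]]
om = omega(x_i, sigma_mat, 0.0025)
for row in om:
    print([round(v, 6) for v in row])
```

The off-diagonal entries show how random effects induce correlation between observations of the same department, which is what distinguishes the mixed model from the pooled one.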

We are interested in the estimation of the fixed effects. As was mentioned before, the literature
suggests that in some cases, the estimated speed of convergence can be substantially
higher than the one obtained by assuming there are no random effects. We also compare
the results with those assuming homogeneity in the parameters using likelihood ratio tests
and the Akaike information criterion (AIC) in order to investigate if a more flexible model
allowing for heterogeneity in the parameters should be preferred.

25 The results of the estimation assume no special structure of the matrix $\Sigma$.
26 For the algorithms used for obtaining maximum likelihood and restricted maximum likelihood estimates in
the case of a linear mixed model, see Pinheiro and Bates (2000).
27 For a review of linear mixed models and a discussion of the estimation with maximum likelihood and REML,
see McCulloch and Searle (2001).

Results for PDB are presented in Table 4.8. The estimated coefficients for the fixed ef- 
fects are similar to the coefficients estimated when assuming homogeneity in the parameters 
(Table 4.6). In the case of $\rho$, the estimated value for the linear mixed model is 0.984, close to
the value 0.989 obtained with OLS and assuming no random effects. It must be noted that the
standard error of the fixed effect of $\rho$ is higher than for the coefficient estimated in the model
assuming homogeneity in the parameters. The estimated standard deviations of both random
effects are quite low, especially the one for $\alpha$, with a value close to 0, suggesting there is
no evidence of different steady states. The value of the Akaike information criterion for the
linear mixed model is larger than for the simpler model assuming parameter homogeneity,
and hence the simpler model is preferred. This is also corroborated by a likelihood ratio test.
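The model comparison described above can be sketched in a few lines. The log-likelihood values and parameter counts below are hypothetical and chosen only for illustration; the counts assume three parameters for the pooled model (intercept, slope, residual variance) and three additional variance-covariance parameters for the random effects.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: smaller values are preferred."""
    return 2 * n_params - 2 * log_likelihood

def likelihood_ratio(loglik_restricted, loglik_general):
    """Likelihood ratio statistic for nested models."""
    return 2.0 * (loglik_general - loglik_restricted)

# Hypothetical fits: pooled model versus linear mixed model.
pooled = {"loglik": 819.0, "k": 3}
mixed = {"loglik": 819.5, "k": 6}

aic_pooled = aic(pooled["loglik"], pooled["k"])
aic_mixed = aic(mixed["loglik"], mixed["k"])
lr = likelihood_ratio(pooled["loglik"], mixed["loglik"])

CHI2_CRIT_DF3_5PCT = 7.815  # 5 percent critical value, 3 degrees of freedom
print("prefer pooled by AIC:", aic_pooled < aic_mixed)
print("reject pooled by LR test:", lr > CHI2_CRIT_DF3_5PCT)
```

In this toy case the extra parameters raise the log-likelihood too little to offset the AIC penalty, which is the pattern reported in the text for both PDB and IDBH.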

Table 4.9 shows the results for IDBH. Once again, the coefficients for the fixed effects 
are close to the ones obtained with the model in the previous section, in which we assumed 
parameter homogeneity (Table 4.7), with $\rho$ equal to 0.986 in both cases. The estimated
standard deviations of both random effects are low, in particular the one for $\alpha$, which is
close to 0, giving no support for the existence of different steady states. The AIC suggests
that the simpler model is better, which is confirmed with a likelihood ratio test.28


Separate Regressions for Each Department 


We also treat all departments as separate entities and run an AR(1) regression for each one. 
These separate regressions shed light upon the effect of past values on current values, but 
due to the small number of observations for each department (25 years), the estimates are not
reliable. In Table 4.12, we present results for PDB. The slope coefficient $\rho$ is lower than 1
for all departments but has large standard errors and is not significant at the 5-percent level
for Cauca and Boyacá.29 The resulting speeds of convergence are implausibly high, with
values ranging from 10 to 60 percent in the case of PDB, a result influenced by the fact that
the period only covers 25 years. Results for IDBH are similar (Table 4.13). 

The graphical analysis of each time series is more informative. In Figure 4.12, we plot 
the individual time series for relative PDB in logs for all departments. We observe that in 
almost all departments, the values change little over time and the series seem stationary. 
They remain either above or below the national average with the exception of Guajira and
Nuevos. The time series do not become closer to the national value over time, except for
Guajira, indicating a lack of economic convergence among departments.

28 Although it is possible to calculate the implied speed of convergence for each department, the interpretation is
difficult. For illustrative purposes, we present them in Tables 4.10 for PDB and 4.11 for IDBH. The associated
speeds of convergence have a larger variability for PDB than for IDBH. The average speed of convergence is
1.6 percent for PDB and 1.4 percent for IDBH.
29 The expected value can be calculated when $|\rho| < 1$, i.e. when the time series is stationary.

Results for IDBH (Figure 4.13) show that most of the time series seem stationary. Interestingly,
the wealthiest department, Bogotá, moves slightly closer to the national average, as
does the poorest department, Chocó. Guajira, although becoming closer to the national
average, still remains below it.


Mixture Models 


In the previous sections, we estimated a model assuming that $\alpha$ and $\rho$ are the same for
all departments. We then allowed these parameters to be different for each department, 
in the context of a linear mixed model, where the parameters are assumed to have a fixed 
component, common to all departments, and a random part. Then, we estimated 25 separate 
AR(1) regressions, one for each department. 

Another possibility is that there are several groups of departments which share the same $\alpha$
and $\rho$. We explore this possibility with a finite mixture model, as described in Leisch (2004)
and Grün and Leisch (2008). These types of models can be applied, assuming that observa- 
tions originate from various groups, where the group affiliations are unknown. Finite mixture 
models with a fixed number of components are estimated with the expectation-maximization 
(EM) algorithm within a maximum likelihood framework. 

We assume three groups and fit the model with the statistical software R (R Development 
Core Team, 2008) and the package flexmix (Leisch and Grün, 2008). Results for PDB and 
IDBH are presented in Tables 4.14 and 4.15. We show estimated $\alpha$ and $\rho$ for each group of
departments, as well as the departments composing each group. 

Results for PDB (Table 4.14) show that Group 1 includes many of the poorest departments 
(e.g., Chocó, Sucre, Nariño, and Córdoba), Group 2 is composed of Nuevos Departamentos 
and La Guajira, and Group 3 includes the richest departments (e.g., Bogotá, Valle, and Antioquia).
Estimated values for $\alpha$ and $\rho$ are similar for Groups 1 and 3, with $\alpha$ being negative
and close to 0 and $\rho$ being close to 0.99, a result that is similar to the estimated value obtained
in Subsection 4.5.4, assuming homogeneity in the parameters. The implied speed of
convergence for these two groups is close to 1 percent. If one believes in the validity of the
estimated expected value of the time series, one would expect that departments belonging to
Group 1 would remain well below the national average over time, while those from Group
3 would remain below, as well, but would be closer to it. As was discussed before, Nuevos
Departamentos and La Guajira experienced high growth rates between 1975 and 2000, associated
with the production of oil and coal. The model captures this, suggesting that both
departments are far from their steady states, showing a large implied speed of convergence
(10 percent), and predicting that both would remain above the national average.

30 Mixture models are only identifiable up to a permutation of the component labels (Leisch, 2004). The names,
Group 1, Group 2, etc., have no special meaning here, and the order of the groups is irrelevant.

Concerning IDBH (Table 4.15), the grouping of departments is similar to the one above, with
Group 1 including many of the poorest departments and Group 3 including the richest ones.
Group 2 now includes Nuevos Departamentos, La Guajira, and Sucre. Groups 1 and 3 have
values for the estimated $\alpha$ that are quite similar to one another, and close to 0. Values for the
estimated $\rho$ are also similar with 0.98 for Group 1 and 0.99 for Group 3, both being close to
the estimated value obtained, assuming homogeneity in the parameters (Subsection 4.5.4).
Nuevos Departamentos, La Guajira, and Sucre have values for $\alpha$ and $\rho$ that are different from
those of the other two groups (-0.01 for $\alpha$ and 0.96 for $\rho$). Once again, the model suggests
that these departments are far from the steady state, with an implied speed of convergence of
4 percent, which is greater than that for Groups 1 (2 percent) and 3 (1 percent). Once
again, with an analyzed time period of only 25 years, it is questionable whether one should
rely upon the estimated expected values.
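The expected value referred to above can be illustrated with the stationary mean of an AR(1) process, alpha / (1 - rho), which exists only when |rho| < 1. The sketch below applies it to the rounded Group 2 estimates for IDBH quoted in the text; because those figures are rounded, and because the exponential of a mean of logs is not the mean of the ratio itself, the result is indicative only. The function name is hypothetical.

```python
import math

def ar1_long_run_mean(alpha, rho):
    """Stationary mean of z_t = alpha + rho * z_{t-1} + u_t, defined for |rho| < 1."""
    if abs(rho) >= 1.0:
        raise ValueError("no stationary mean when |rho| >= 1")
    return alpha / (1.0 - rho)

# Rounded Group 2 estimates for IDBH reported in the text.
mean_log = ar1_long_run_mean(-0.01, 0.96)
ratio_to_national = math.exp(mean_log)
print(round(mean_log, 3), round(ratio_to_national, 3))
```

Small changes in rho move this value considerably when rho is close to 1, which is one way to see why the text treats the estimated expected values with caution.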


4.5.5 Kernel Density Estimators 


All the results for kernel density estimations were computed with the statistical software R 
(R Development Core Team, 2008) and the package ks.31 For both univariate and bivariate
kernel density estimations, we use Gaussian kernels and smoothed cross validation bandwidth
selectors32 (Jones et al., 1991; Duong and Hazelton, 2005). In the univariate case, and
as suggested by Bowman and Azzalini (1997), we use for both years the same bandwidth, 
which is computed as the mean of the two selected bandwidths obtained for each year sep- 
arately. In the bivariate case, the smoothed cross validation is unconstrained, meaning that 
we do not impose that the (nonsingular) bandwidth matrix H has to be diagonal in Equation 
4.10. Hence, we are able to handle correlation between components, as we allow kernels 
to have an arbitrary orientation (Wand and Jones, 1995). As we are especially interested in 
checking whether a large portion of the probability mass remains clustered around the 45- 
degree diagonal, this flexibility is relevant for us. If we were to impose a diagonal matrix H, 
only kernels which are oriented to the coordinate axes would be allowed. 
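The univariate procedure can be sketched as follows: a Gaussian kernel density estimator is evaluated for both years with one shared bandwidth, taken as the mean of the two bandwidths selected separately. The snippet is written in Python for illustration and does not reproduce the ks package; the smoothed cross validation selector is not implemented, so the two bandwidths passed in, like the data, are hypothetical placeholders.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a function evaluating the Gaussian kernel density estimate."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density

def shared_bandwidth(h_first_year, h_second_year):
    """Common bandwidth for both years: mean of the two selected bandwidths."""
    return 0.5 * (h_first_year + h_second_year)

# Hypothetical log relative incomes for five departments in each year.
logs_1975 = [-0.60, -0.35, -0.10, 0.05, 0.45]
logs_2000 = [-0.55, -0.30, -0.05, 0.10, 0.50]

h = shared_bandwidth(0.20, 0.16)  # placeholder selector outputs
f_1975 = gaussian_kde(logs_1975, h)
f_2000 = gaussian_kde(logs_2000, h)

for x in (-0.5, 0.0, 0.5):
    print(x, round(f_1975(x), 3), round(f_2000(x), 3))
```

Using the same bandwidth for both years keeps differences between the two curves from being an artifact of different degrees of smoothing.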

Univariate kernel density estimations of the logarithm of relative departmental PDB for
the years 1975 and 2000 are shown in Figure 4.16. Both densities seem unimodal and are
very similar. Thus, according to this figure, there were almost no changes in the distribution.
Table 4.16 shows the results of a formal bootstrap test of equality of the estimated densities
for both years. We cannot reject the hypothesis that the two densities are identical. Bivariate
kernel-density estimators are presented in Figures 4.17 and 4.18. Both figures make clear that
most of the mass is concentrated along the 45-degree diagonal and hence support persistence
in the distribution. Departments with a relative PDB that was above (or below) average in
the year 1975 tend to remain above (or below) average in 2000. Two interesting cases are
La Guajira and Nuevos Departamentos, as they show some mobility. Nuevos Departamentos
was close to the national average in 1975 and is clearly above the average in 2000, while La
Guajira was clearly below the national average in 1975 and is quite close to it in 2000.

31 ks is currently the most comprehensive kernel density estimation package in R (Duong, 2008). All the
estimations were done with the function kde.
32 We have also tried direct plug-in methods as suggested by Sheather and Jones (1991) and obtained results
that are not very dissimilar.

Turning to results using the logarithm of relative departmental IDBH, Figure 4.19 presents 
the univariate kernel estimators for the years 1975 and 2000, showing a slight shift of the 
distribution to the right in 2000. The distribution narrowed between 1975 and 2000 and the 
two modes observed in 1975 in the left and right tails of the distribution almost disappeared 
in 2000. However, according to the bootstrap test of equality of the estimated densities for 
both years, we cannot reject the hypothesis that the two densities are identical (Table 4.16). 
Bivariate kernel density estimators in Figures 4.20 and 4.21 show some mobility, as well. In the contour
plot (Figure 4.21), the mass of the distribution rotates slightly clockwise, suggesting mild 
convergence in the distribution. 


4.6 Conclusions 


Returning to the questions raised at the beginning of the study, we do not find absolute or 
conditional convergence in PDB using the regression approach. The distributional approach 
shows persistence in the distribution, i.e., relative to the average, each department remains 
in the position where it was located in 1975. Results of both methods point in the same 
direction — there is no convergence but persistence in PDB does exist. 

Analysis of IDBH shows absolute convergence using the regression approach. After testing
different models allowing for parameter heterogeneity, we found that there is no evidence
of the existence of different steady states. The pooled model using TSCS provides our pre- 
ferred estimators. Bivariate kernel density estimators show some improvements in the distri- 
bution. However, the changes are small and consistent with the low speed of convergence of 
around 1 percent. 

Different factors explain our results. Differences in geography, infrastructure, and population
density among departments are relevant factors to explain lack of convergence in
PDB, as are differences in production structures and value added by department. Except
for the mining departments, the different production structures remained almost unchanged
between 1975 and 2000 (Table 4.17).33 However, mineral exploitation in Colombia is relatively
recent, going back only to the mid-eighties, and this fact explains why the group of
Nuevos and the department of La Guajira are the only initially poor departments that grew
more quickly than the wealthier departments, according to PDB data. Previous literature had
already pointed to the fact that once the mining departments are excluded, any hint of convergence
disappears (Birchenall and Murcia, 1997) and that departments with a high share of
agricultural production had the lowest growth rates (Bonet, 1999). Three departments concentrated
at least 50 percent of PDB in both evaluated years: Antioquia, Bogotá, and Valle
del Cauca. These three departments combined produced 65 percent of the manufacturing
output in 1975 and 60 percent in 2000. The stability of the shares in other sectors is also
remarkable, indicating departmental concentration and low mobility of production factors
across the country.


At least two of the assumptions of the Solow model, which is the usual theoretical frame- 
work for studying convergence, seem problematic for application to the Colombian case. 
First, the neoclassical model assumes mobility of factors, which is in this case constrained 
by geographic, climatic, and infrastructural issues, as well as by the internal conflict issue. 
For instance, several productive sectors periodically suffer from attacks by violent groups, 
not only on physical capital, but also human capital through kidnapping and extortion. Sec- 
ond, the assumption of constant returns to scale is an oversimplification that does not hold 
for all sectors in the economy. As has been argued by World Bank (2009b), while returns to 
scale in agriculture tend to be constant, those in manufacturing and services are increasing. 


The slow convergence observed in IDBH could be explained by recent redistributive poli- 
cies. Recall that per capita IDBH is a measure of households' income after subtracting taxes 
on property and rental income and net payments to the social security, and adding other net 
current transfers. We must also note that Colombia has experienced higher public spending 
in social sectors and infrastructure. Literature dealing with the direct link between conver- 
gence and public spending is scarce, but suggests that it affected the relative position of 
some departments, although not the distribution as a whole (Ardila Rueda, 2004), and that 
efficiency of public spending has been decreasing over time, mainly due to political interests 
and corruption. 


33 Nuevos Departamentos increased its participation in the Colombian mining sector from 11 percent in 1975
to 55 percent in 2000 (Table 4.17).


Summary of Results

                                        Per capita income measure used
                                        PDB                 IDBH
Classical Approach: Convergence?
  Sigma                                 No                  Yes
  Absolute Beta                         No                  Yes
  Conditional Beta, Cross Sections      No                  No
  Conditional Beta, Pooled TSCS         No                  Yes
    (assuming homogeneity of parameters)
Distributional Approach
  Univariate Kernel Estimators          Distribution        Distribution
                                        Unchanged           Unchanged
  Bivariate Kernel Estimators           Persistence in      Suggests slow
                                        the Distribution    Convergence

Note: Results for conditional beta convergence with TSCS data and for the distributional approach
are based on relative values, i.e., ratios to the national level.


Increased social spending has also benefited from mining sector revenues which are distributed
across all departments through the fiscal system.34 IDBH of mining departments is
still very low and did not exhibit the high growth rates observed in PDB.35 One reason for
this is that fiscal decentralization began in the late eighties and the reforms are thus still too
recent to be fully evaluated. A second reason is that financial resources from mining sectors
are not efficiently spent because of corruption and are not sufficient to compensate for the
low starting point in income of these departments. Recall that in 1975, La Guajira was the
second poorest department in Colombia and that a large part of its population is indigenous
and poorly linked to the departmental economy. Previous research suggests that even if revenues
of coal exports in La Guajira were distributed efficiently and without any corruption-related
loss (corruption levels seem to be particularly high in mining departments), IDB of
that department would still be about 60 percent of national IDB in 2000 (Meisel, 2007a).

34 Oil revenues are divided between direct and indirect revenues and correspond to about eight to 25 percent of
total extracted crude oil income. Direct revenues are those given to producing departments, municipalities,
and ports of exports basically to finance investment in social sectors, and account for about 76 percent of oil
revenues. Indirect revenues are those distributed among non-producing departments (Hernández, 2004).
35 Producing departments are obliged to spend at least 50 percent of the received mining revenues on social
investment until having achieved certain minimum thresholds for infant mortality, health care, education,
water, and sanitation. Indirect revenues are distributed according to projects presented through territorial
entities (Law 141 of 1994).

Although overall public spending has increased, the transfer system bears some disadvan- 
tages for poor departments. Evidence shows that after totaling all public revenue (not only 
that directed to social sectors), there is no fiscal equalization in Colombia and the system is 
regressive; wealthy municipalities have the highest shares of public funds. 

Two other issues have to be taken into consideration for interpreting the results of both 
PDB and IDBH. One is that in 2000, our last year of analysis, the country was experiencing 
a large economic crisis which affected public and private finances. Transfers from the central 
government were thus also affected by the crisis. A second issue is related to the domestic 
conflict. Between 1998 and 2002, violence escalated dramatically when the groups involved 
in the war were fighting one another for control of strategic areas. Sánchez and Palau (2006),
who deal directly with this last issue, argue that decentralization policies, political and fiscal, 
affected the interests of armed groups and even strengthened them through the sharp increase 
in local resources. The higher political autonomy at the local level increased the ability of 
armed groups to intimidate politicians and to extract rents from public funds. Guerrillas 
relocated in strategic zones with greater levels of prosperity, the facility for processing illicit 
drugs, and an intimidated local population (Sánchez and Palau, 2006). 

One of the policy implications of this study is the necessity of monitoring the efficiency 
of social spending and enforcing decentralization policies so that a faster convergence in 
IDBH can be achieved. Concerning convergence in PDB, reallocation of productive sector 
resources is not easy to achieve and could lead to efficiency losses, but the state can, for
example, encourage the accumulation of human capital and improve infrastructure in lagging 
departments, which would help attract investments in the long run. Additionally, it is crucial 
to find a way out of the internal conflict to foster factor mobility in Colombia, particularly in 
those areas without significant state presence. We consider it vital to have an explicit regional 
policy in Colombia to foster growth in departments lagging behind national averages, after 
conducting case studies to assess which policies could be most effective in each case. 

Finally, for monitoring convergence across departments in the future, it is essential to 
have consistent time series constructed under a single methodology. Unfortunately, the work 
done by CEGA for the period 1975 to 2000 was not continued for the years after 2000. Such 
a project is of high policy relevance for the country.


4.7 Tables


Table 4.1: Colombia. Gross Domestic Product (Constant Million Pesos of 1994), Per Capita
GDP and Population. 1980-2006.

Year   GDP          Per capita   Growth     Population   Growth
       (million)    GDP          rate (%)                rate (%)
1980   40822304     1503335                 27154504
1981   41846404     1503069      -0.02      27840636     2.53
1982   42160220     1476873      -1.74      28546950     2.54
1983   42820420     1462737      -0.96      29274176     2.55
1984   44217404     1472781      0.69       30023068     2.56
1985   45475604     1476748      0.27       30794424     2.57
1986   48189708     1533078      3.81       31433316     2.07
1987   50775504     1582200      3.20       32091720     2.09
1988   52808848     1611804      1.87       32763808     2.09
1989   54544940     1630958      1.19       33443488     2.07
1990   56873928     1666658      2.19       34124536     2.04
1991   58222936     1671462      0.29       34833548     2.08
1992   60757528     1710026      2.31       35530176     2.00
1993   64226880     1773819      3.73       36208244     1.91
1994   67532864     1832015      3.28       36862624     1.81
1995   71046216     1895088      3.44       37489664     1.70
1996   72506824     1904234      0.48       38076640     1.57
1997   74994024     1940536      1.91       38646044     1.50
1998   75421328     1923949      -0.85      39201320     1.44
1999   72250600     1817821      -5.52      39745712     1.39
2000   74363832     1846071      1.55       40282216     1.35
2001   75458112     1849177      0.17       40806312     1.30
2002   76917224     1861165      0.65       41327460     1.28
2003   79884488     1908947      2.57       41847420     1.26
2004   83772432     1977279      3.58       42367528     1.24
2005   87727928     2045484      3.45       42888592     1.23
2006   93881688     2162904      5.74       43405388     1.20

Source: Own calculations based on National Accounts and Census 2005, DANE


Table 4.2: Beta Convergence Using Cross-sections and Non Linear Least Squares. Dependent
Variable: Average Growth Rate of pc PDB 1975-2000.

                                       Robust HC3
Variable                 Coefficient   Std. Err.    95% conf. interval
Intercept                0.1055        0.1259       -0.1548    0.3659
$\beta$                  0.0067        0.0108       -0.0155    0.0290
$\beta$ (%)              0.67
Number of observations   25
Adj. R-squared           0.0112

Source: Own calculations based on data from CEGA. Constant prices of 1994.
Note: HC3 standard errors calculated according to Davidson and MacKinnon (1993).

Table 4.3: Beta Convergence Using Cross-sections and Non Linear Least Squares. Dependent
Variable: Average Growth Rate of pc IDBH 1975-2000.

                                       Robust HC3
Variable                 Coefficient   Std. Err.    95% conf. interval
Intercept                0.1533        0.0392       0.0721     0.2345
$\beta$                  0.0119        0.0039       0.0038     0.0200
$\beta$ (%)              1.19
Number of observations   25
Adj. R-squared           0.3514

Source: Own calculations based on data from CEGA. Constant prices of 1994.
Note: HC3 standard errors calculated according to Davidson and MacKinnon (1993).

Table 4.4: NLLS Regressions. Dependent Variable: Average Growth Rate of pc PDB. 1975-2000.


                                  (1)        (2)        (3)        (4)        (5)
Regressors                        b/se       b/se       b/se       b/se       b/se
constant                          0.1157     0.1534     0.3533     0.2753     0.2271
                                  (0.1306)   (0.4804)   (0.3071)   (0.2533)   (0.3062)
speed of convergence β            0.0076     0.0401     0.0186     0.0226     0.0375
                                  (0.0114)   (0.0809)   (0.0250)   (0.0278)   (0.0622)
log (life expectancy 1975)                   0.0402     -0.0092
                                             (0.1344)   (0.0785)
log (literacy 1973)                          0.0355                0.015
                                             (0.0645)              (0.0367)
log (transformed saving rate)                0.0017     0.0047     0.0036     0.0043
                                             (0.0059)   (0.0064)   (0.0086)   (0.0128)
log (population growth + 0.05)               0.0378     0.032      0.0293     -0.0009
                                             (0.0443)   (0.0373)   (0.0371)   (0.0313)
log (net enrollment rate 1985)                                                0.0342
                                                                              (0.0414)
Number of observations            24         24         24         25         24
R-square                          0.06       0.31       0.28       0.24       0.31
Adjusted R-square                 0.02       0.12       0.13       0.09       0.16

* p < 0.05, ** p < 0.01, *** p < 0.001
Note: HC3 robust standard errors (Davidson and MacKinnon, 1993) in brackets.


Table 4.5: NLLS Regressions. Dependent Variable: Average Growth Rate of pc IDBH. 1975-2000.


                                  (1)        (2)        (3)        (4)        (5)
Regressors                        b/se       b/se       b/se       b/se       b/se
constant                          0.1574***  0.1272     0.335      0.1400     0.1233
                                  (0.0402)   (0.2917)   (0.1817)   (0.0962)   (0.1038)
speed of convergence β            0.0123**   0.0282     0.0135     0.0254     0.0262
                                  (0.0040)   (0.0280)   (0.0110)   (0.0126)   (0.0155)
log (life expectancy 1975)                   0.0063     -0.0443
                                             (0.0777)   (0.0485)
log (literacy 1973)                          0.031                 0.0279
                                             (0.0309)              (0.0161)
log (transformed saving rate)                0.0007     0.0036     0.0009     0.0019
                                             (0.0033)   (0.0046)   (0.0021)   (0.0055)
log (population growth + 0.05)               -0.001     -0.0079    -0.0037    -0.0227
                                             (0.0224)   (0.0181)   (0.0183)   (0.0127)
log (net enrollment rate 1985)                                                0.0241
                                                                              (0.0171)
Number of observations            24         24         24         25         24
R-square                          0.40       0.58       0.53       0.56       0.64
Adjusted R-square                 0.37       0.47       0.43       0.48       0.56

* p < 0.05, ** p < 0.01, *** p < 0.001
Note: HC3 robust standard errors (Davidson and MacKinnon, 1993) in brackets.


Table 4.6: OLS Linear Regression. TSCS Data. Dependent Variable: log(y_i,t). Relative Per
Capita PDB. 1975-2000.


Variable                  Coefficient   Std. Err.    95% conf. interval
Intercept                 -0.0023       0.0032       -0.0086   0.0040
log(y_i,t-1)              0.9891        0.0069       0.9755    1.0027
Implied β                 1.09%
Number of observations    625
R-squared                 0.9730
AIC                       -1632


Source: Own calculations based on data from CEGA. Constant prices of 1994.


Table 4.7: OLS Linear Regression. TSCS Data. Dependent Variable: log(y_i,t). Relative Per
Capita IDBH. 1975-2000.


Variable                  Coefficient   Std. Err.    95% conf. interval
Intercept                 -0.0014       0.0017       -0.0048   0.0020
log(y_i,t-1)              0.9862        0.0046       0.9771    0.9953
Implied β                 1.38%
Number of observations    625
R-squared                 0.9867
AIC                       -2183


Source: Own calculations based on data from CEGA. Constant prices of 1994.
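
The "Implied β" rows in Tables 4.6 and 4.7 appear to follow from the estimated slope as β = 1 - ρ, where ρ denotes the coefficient on log(y_i,t-1). This reading is inferred from the reported numbers and is not stated alongside the tables. A minimal Python sketch under that assumption (the function name is illustrative only):

```python
def implied_beta_percent(rho: float) -> float:
    """Implied annual convergence rate in percent, ASSUMING beta = 1 - rho.

    rho is the estimated coefficient on the lagged log of relative income.
    """
    return 100.0 * (1.0 - rho)


# Slopes as reported in Tables 4.6 (PDB) and 4.7 (IDBH).
for label, rho in [("PDB", 0.9891), ("IDBH", 0.9862)]:
    print(f"{label}: implied beta = {implied_beta_percent(rho):.2f}%")
```

Under this assumption the two slopes give 1.09% and 1.38%, matching the tabulated values.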

Table 4.8: Linear Mixed Model (REML). TSCS Data. Dependent Variable: log(y_i,t). Relative
Per Capita PDB. 1975-2000.


Fixed effects
Variable                  Coefficient   Std. Err.
Intercept                 -0.0027       0.0034
log(y_i,t-1)              0.9836        0.0083

Random effects
Standard deviation        Estimate
Intercept                 6.1e-09
log(y_i,t-1)              0.0163

Number of observations    625
Number of groups          25
AIC                       -1606


Source: Own calculations based on data from CEGA. Constant prices of 1994. 


Table 4.9: Linear Mixed Model (REML). TSCS Data. Dependent Variable: log(y_i,t). Relative
Per Capita IDBH. 1975-2000.


Fixed effects
Variable                  Coefficient   Std. Err.
Intercept                 -0.0014       0.0022
log(y_i,t-1)              0.9859        0.0047

Random effects
Standard deviation        Estimate
Intercept                 0.0000
log(y_i,t-1)              0.0026

Number of observations    625
Number of groups          25
AIC                       -2155


Source: Own calculations based on data from CEGA. Constant prices of 1994. 


Table 4.10: Implied Convergence Rates Using TSCS Data and Linear Mixed Models 
(REML). Per capita PDB. 1975-2000 


Department Intercept Slope Implied Expected 
α ρ β (%) Value 
Nuevos Departamentos -0.003 0.976 2.4 -0.111 
Antioquia -0.003 0.984 1.6 -0.168 
Atlántico -0.003 0.983 1.7 -0.156 
Bogotá D. C. -0.003 0.986 1.4 -0.197 
Bolívar -0.003 0.982 1.8 -0.149 
Boyacá -0.003 0.981 1.9 -0.140 
Caldas -0.003 0.981 1.9 -0.137 
Caquetá -0.003 0.990 1.0 -0.265 
Cauca -0.003 0.986 1.4 -0.193 
Cesar -0.003 0.986 1.4 -0.187 
Córdoba -0.003 0.989 1.1 -0.242 
Cundinamarca -0.003 0.983 1.7 -0.161 
Chocó -0.003 0.994 0.6 -0.483 
Huila -0.003 0.983 1.7 -0.162 
La Guajira -0.003 0.957 4.3 -0.063 
Magdalena -0.003 0.986 1.4 -0.194 
Meta -0.003 0.981 1.9 -0.143 
Nariño -0.003 0.988 1.2 -0.220 
Norte Santander -0.003 0.986 1.4 -0.192 
Quindío -0.003 0.977 2.3 -0.119 
Risaralda -0.003 0.982 1.8 -0.148 
Santander -0.003 0.982 1.8 -0.152 
Sucre -0.003 0.998 0.2 -1.095 
Tolima -0.003 0.983 1.7 -0.158 
Valle -0.003 0.985 1.5 -0.174 
Mean 1.6 
Median 1.7 


Source: Own calculations based on data from CEGA. Constant prices of 1994. 


Table 4.11: Implied Convergence Rates Using TSCS Data and Linear Mixed Models 
(REML). Per capita IDBH. 1975-2000 


Department Intercept Slope Implied Expected 
α ρ β (%) Value 
Nuevos Departamentos -0.001 0.986 1.4 -0.10 
Antioquia -0.001 0.986 1.4 -0.10 
Atlántico -0.001 0.986 1.4 -0.10 
Bogotá D. C. -0.001 0.986 1.4 -0.10 
Bolívar -0.001 0.986 1.4 -0.10 
Boyacá -0.001 0.986 1.4 -0.10 
Caldas -0.001 0.986 1.4 -0.10 
Caquetá -0.001 0.986 1.4 -0.10 
Cauca -0.001 0.986 1.4 -0.10 
Cesar -0.001 0.986 1.4 -0.10 
Córdoba -0.001 0.986 1.4 -0.10 
Cundinamarca -0.001 0.986 1.4 -0.10 
Chocó -0.001 0.987 1.3 -0.11 
Huila -0.001 0.986 1.4 -0.10 
La Guajira -0.001 0.985 1.5 -0.09 
Magdalena -0.001 0.986 1.4 -0.10 
Meta -0.001 0.986 1.4 -0.10 
Nariño -0.001 0.986 1.4 -0.10 
Norte Santander -0.001 0.986 1.4 -0.10 
Quindío -0.001 0.986 1.4 -0.10 
Risaralda -0.001 0.986 1.4 -0.10 
Santander -0.001 0.986 1.4 -0.10 
Sucre -0.001 0.987 1.3 -0.11 
Tolima -0.001 0.986 1.4 -0.10 
Valle -0.001 0.986 1.4 -0.10 
Mean 1.4 
Median 1.4 


Source: Own calculations based on data from CEGA. Constant prices of 1994. 


Table 4.12: Autoregressive Processes of Order 1. Dependent Variable: log(y_i,t). Per Capita PDB. 1975-2000. 


Intercept (α) Slope (ρ) Regression Expected Implied 
Department Coefficient Std. Error Coefficient Std. Error Adjusted R² Value β (%) 
Nuevos Departamentos 0.03 0.03 0.85 0.14 0.62 0.21 14.75 
Antioquia 0.06 0.03 0.65 0.16 0.41 0.16 35.48 
Atlántico -0.02 0.01 0.85 0.06 0.88 -0.15 14.99 
Bogotá D. C. 0.08 0.09 0.84 0.17 0.52 0.48 16.10 
Bolivar -0.05 0.02 0.69 0.15 0.48 -0.14 31.16 
Boyacá -0.09 0.03 0.29 0.21 0.08 
Caldas -0.11 0.03 0.39 0.18 0.17 -0.17 61.42 
Caquetá -0.16 0.08 0.76 0.13 0.58 -0.64 24.10 
Cauca -0.26 0.13 0.62 0.18 0.33 -0.68 38.31 
Cesar -0.07 0.04 0.85 0.10 0.77 -0.45 15.34 
Córdoba -0.18 0.09 0.77 0.12 0.66 -0.78 22.95 
Cundinamarca 0.01 0.00 0.87 0.13 0.68 0.06 12.51 
Chocó -0.37 0.17 0.67 0.16 0.45 -1.11 33.14 
Huila -0.11 0.05 0.66 0.16 0.43 -0.31 34.25 
La Guajira 0.01 0.03 0.90 0.06 0.92 0.06 10.36 
Magdalena -0.29 0.11 0.52 0.18 0.27 -0.59 48.31 
Meta -0.04 0.02 0.67 0.15 0.48 -0.11 32.80 
Nariño -0.30 0.14 0.63 0.16 0.40 -0.83 36.65 
Norte Santander -0.47 0.11 0.09 0.21 0.01 
Quindío -0.04 0.03 0.64 0.16 0.41 -0.12 36.26 
Risaralda -0.02 0.01 0.74 0.15 0.52 -0.07 25.89 
Santander 0.00 0.01 0.86 0.11 0.71 0.01 13.89 
Sucre -0.24 0.09 0.75 0.10 0.72 -0.96 25.32 
Tolima -0.09 0.05 0.71 0.15 0.51 -0.30 29.06 
Valle 0.05 0.05 0.80 0.20 0.41 0.23 19.61 
Mean -0.11 0.68 
Median -0.07 0.71 


Source: Own calculations based on data from CEGA. Constant prices of 1994. 
Note: Expected values (α/(1 - ρ)) calculated only for ρ significant at 5% level and lower than 1. 


Table 4.13: Autoregressive Processes of Order 1. Dependent Variable: log(y_i,t). Per capita IDBH. 1975-2000. 


Intercept (α) Slope (ρ) Regression Expected Implied 
Department Coefficient Std. Error Coefficient Std. Error Adjusted R² Value β (%) 
Nuevos Departamentos -0.02 0.02 0.71 0.17 0.44 -0.08 29.39 
Antioquia 0.02 0.01 0.86 0.11 0.74 0.11 14.22 
Atlántico -0.01 0.01 0.84 0.09 0.78 -0.07 15.57 
Bogotá D. C. -0.04 0.03 1.04 0.04 0.96 
Bolívar -0.06 0.03 0.83 0.10 0.76 -0.35 16.92 
Boyacá -0.04 0.03 0.60 0.23 0.23 -0.11 39.76 
Caldas 0.00 0.02 0.93 0.06 0.91 -0.01 6.81 
Caquetá -0.14 0.08 0.76 0.13 0.59 -0.58 23.54 
Cauca -0.02 0.06 0.95 0.12 0.75 -0.32 5.03 
Cesar -0.01 0.03 0.95 0.10 0.79 -0.19 5.30 
Córdoba -0.13 0.07 0.79 0.11 0.68 -0.60 20.87 
Cundinamarca -0.01 0.01 0.90 0.07 0.88 -0.07 9.80 
Chocó -0.08 0.12 0.92 0.10 0.78 -1.07 7.87 
Huila -0.09 0.04 0.68 0.13 0.52 -0.27 32.45 
La Guajira -0.07 0.05 0.86 0.07 0.86 -0.50 13.79 
Magdalena -0.03 0.07 0.95 0.12 0.73 -0.49 5.46 
Meta -0.05 0.02 0.71 0.10 0.67 -0.17 28.72 
Nariño -0.23 0.13 0.71 0.16 0.46 -0.82 28.68 
Norte Santander -0.05 0.05 0.87 0.10 0.75 -0.40 12.91 
Quindío -0.04 0.02 0.74 0.12 0.61 -0.15 25.57 
Risaralda -0.04 0.02 0.63 0.14 0.47 -0.11 36.51 
Santander -0.02 0.01 0.65 0.18 0.35 -0.05 35.48 
Sucre -0.19 0.09 0.74 0.13 0.60 -0.71 26.37 
Tolima -0.01 0.03 0.96 0.09 0.82 -0.27 3.53 
Valle 0.03 0.02 0.79 0.16 0.51 0.14 20.62 
Mean -0.05 0.82 
Median -0.04 0.83 


Source: Own calculations based on data from CEGA. Constant prices of 1994. 
Note: Expected values (α/(1 - ρ)) calculated only for ρ significant at 5% level and lower than 1. 
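
The expected values in Tables 4.12 and 4.13 correspond to the long-run mean of an AR(1) process, α/(1 - ρ), which the note restricts to slopes that are significant and below 1. The sketch below illustrates that rule. The significance decision is passed in as a flag because the exact test is not restated here, and the function and argument names are hypothetical. Since the tabulated coefficients are rounded to two decimals, recomputed values will differ slightly from the printed ones.

```python
from typing import Optional


def ar1_expected_value(alpha: float, rho: float,
                       rho_significant: bool) -> Optional[float]:
    """Long-run mean alpha / (1 - rho) of y_t = alpha + rho * y_{t-1} + e_t.

    Returns None when the rule in the table note is not met, that is,
    when rho is not significant or is not lower than 1.
    """
    if not rho_significant or rho >= 1.0:
        return None
    return alpha / (1.0 - rho)


# Rounded coefficients taken from the tables (illustration only).
print(ar1_expected_value(-0.02, 0.85, True))   # Atlántico, Table 4.12
print(ar1_expected_value(-0.04, 1.04, True))   # Bogotá D. C., Table 4.13: None
```

The second call returns None because the slope exceeds 1, which is consistent with the blank cells for Bogotá D. C. in Table 4.13.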




Table 4.14: Mixture Model with 3 Components. Fitted with ML. Dependent Variable:
log(y_i,t). Relative Per capita PDB. 1975-2000.


Department              Group   Intercept   Slope   Implied   Expected
                                α           ρ       β (%)     value
Bolívar                 1       -0.007      0.988   1.168     -0.618
Boyacá                  1
Caldas                  1
Caquetá                 1
Cauca                   1
Cesar                   1
Córdoba                 1
Chocó                   1
Magdalena               1
Meta                    1
Nariño                  1
Quindío                 1
Risaralda               1
Sucre                   1
Tolima                  1
Nuevos Departamentos    2       0.015       0.900   9.986     0.153
La Guajira              2
Antioquia               3       -0.001      0.990   1.045     -0.139
Atlántico               3
Bogotá D.C.             3
Cundinamarca            3
Huila                   3
Norte Santander         3
Santander               3
Valle                   3


Source: Own calculations based on data from CEGA. Constant prices of 1994.


Table 4.15: Mixture Model with 3 Components. Fitted with ML. Dependent Variable:
log(y_i,t). Relative Per Capita IDBH. 1975-2000.


Department              Group   Intercept   Slope   Implied   Expected
                                α           ρ       β (%)     value
Bolívar                 1       0.000       0.982   1.757     -0.004
Caquetá                 1
Cauca                   1
Cesar                   1
Córdoba                 1
Chocó                   1
Magdalena               1
Nariño                  1
Quindío                 1
Nuevos Departamentos    2       -0.013      0.961   3.893     -0.325
La Guajira              2
Sucre                   2
Antioquia               3       -0.003      0.988   1.174     -0.239
Atlántico               3
Bogotá D.C.             3
Boyacá                  3
Caldas                  3
Cundinamarca            3
Huila                   3
Meta                    3
Norte Santander         3
Risaralda               3
Santander               3
Tolima                  3
Valle                   3


Source: Own calculations based on data from CEGA. Constant prices of 1994. 


Table 4.16: Bootstrap tests of equality of univariate density estimates. Income indicators 


Variable Years compared p-value 
Per capita PDB 1975, 2000 0.998 
Per capita IDBH 1975, 2000 0.844 


Test proposed by Bowman and Azzalini (1997). 1000 replications. 
Source: Own calculations based on data from CEGA. 


Table 4.17: Share of Total PDB and Share of selected PDB Sectors by Department. Years 1975 and 2000.


                    Percentage of Agriculture   Mining        Manufacturing Financing     Government    Commerce
                    total PDB
Department          1975   2000   1975   2000   1975   2000   1975   2000   1975   2000   1975   2000   1975   2000
Antioquia           15.3   14.4   12.2   13.9   21.3   4.0    21.0   18.0   14.1   15.7   15.5   10.9   18.4   16.9
Atlántico           5.3    4.3    0.8    0.7    0.0    0.0    7.5    6.1    4.8    4.0    2.7    3.4    6.4    5.1
Bogotá D.C.         22.4   23.6   0.7    0.0    0.5    0.0    25.4   22.6   40.1   45.6   27.8   24.8   21.2   18.3
Bolívar             4.0    4.1    3.3    2.9    2.3    0.9    4.8    5.6    2.0    1.8    3.4    2.9    4.1    6.3
Boyacá              3.8    2.9    7.2    7.1    13.1   2.8    2.5    1.8    1.9    1.2    3.7    4.6    3.2    2.1
Caldas              2.5    2.0    3.7    2.6    0.7    0.1    2.2    2.0    2.3    1.8    2.4    2.9    2.0    1.9
Caquetá             0.5    0.5    1.2    1.6    0.0    0.0    0.1    0.0    0.1    0.2    0.7    0.9    0.2    0.2
Cauca               1.8    1.9    4.0    3.3    0.5    0.2    1.0    2.2    0.9    0.7    2.3    3.0    1.5    2.3
Cesar               1.6    1.5    3.7    3.0    0.0    5.5    0.6    0.4    1.3    0.5    1.2    1.8    1.5    0.9
Chocó               0.4    0.3    0.5    1.0    6.0    0.3    0.0    0.0    0.2    0.1    1.2    0.7    0.2    0.2
Córdoba             2.0    1.9    6.2    3.7    1.5    2.8    0.1    1.9    1.0    0.7    2.1    2.2    1.2    1.6
Cundinamarca        4.8    5.7    8.4    12.7   7.5    1.2    4.2    8.2    1.6    1.4    4.7    7.4    5.2    8.6
Guajira             0.4    1.3    0.6    0.6    3.6    12.3   0.1    0.0    0.4    0.3    0.6    0.8    0.3    0.1
Huila               1.6    1.7    3.1    2.5    1.5    5.4    0.4    0.6    1.2    1.0    2.0    2.3    1.1    1.1
Magdalena           1.8    1.5    3.6    3.4    1.9    0.0    0.4    0.3    1.1    0.7    1.8    1.9    1.0    0.8
Meta                1.2    1.5    2.4    2.8    0.3    5.1    0.4    0.7    1.2    0.9    1.0    1.3    0.9    1.4
Nariño              1.6    1.6    2.9    3.8    1.0    0.1    0.4    0.4    1.3    1.0    1.7    2.5    1.2    1.1
Norte de Santander  2.0    1.8    3.1    3.3    5.3    0.6    0.9    0.7    1.7    1.4    2.9    2.7    1.3    1.0
Nuevos              1.     4.7    2.4    3.9    10.7   55.0   0.1    0.1    1.2    1.0    2.0    2.9    0.5    0.8
Quindío             1.2    1.1    2.2    2.3    0.0    0.0    0.6    0.3    1.1    1.1    1.4    1.2    1.2    0.6
Risaralda           2.0    1.9    2.1    1.5    0.1    0.0    2.2    2.3    2.0    1.6    1.6    1.7    2.0    2.0
Santander           5.2    5.0    6.2    6.9    18.9   2.1    5.2    6.5    3.8    3.5    4.0    5.3    5.0    7.7
Sucre               1.0    0.7    3.0    1.6    0.3    0.1    0.2    0.2    0.6    0.3    1.0    0.9    0.6    0.4
Tolima              3.2    2.6    7.0    6.0    0.9    1.3    1.1    1.9    2.3    1.6    4.1    3.1    2.7    2.7
Valle               13.2   11.4   9.3    8.8    2.0    0.3    18.5   17.2   12.0   11.8   7.8    8.3    17.1   16.1


Source: Own calculations based on data from CEGA. Constant prices of 1994.


4.8 Figures


Figure 4.1: Map of Colombia.

Figure 4.2: Box Plot: Log of Per Capita PDB. 1975-2000.
Source: Own calculations based on data from CEGA. Constant prices of 1994.

Figure 4.3: Box Plot: Log of Per Capita IDBH. 1975-2000.

Figure 4.4: Box Plot: Log of Relative Per Capita PDB. 1975-2000.
Source: Own calculations based on data from CEGA. Constant prices of 1994.

Figure 4.5: Box Plot: Log of Relative Per Capita IDBH. 1975-2000.

Figure 4.6: Sigma Convergence. GDP by Department.
[Plot: standard deviation (variables in logs) by year, 1980 to 2004. Series: DANE 1980-1996, DANE 1990-2005, DANE 2000-2005.]
Source: Own calculations based on data from DANE.

Figure 4.7: Sigma Convergence. Per Capita Gross Departmental Product (PDB) and Gross
Personal Disposable Income (IDBH). 1975-2000.

Figure 4.8: Beta Convergence. Per Capita PDB. 1975-2000.

Figure 4.9: Beta Convergence without Nuevos, Chocó and Guajira. Per Capita PDB. 1975-2000.
Source: Own calculations based on data from CEGA. Constant prices of 1994.

Figure 4.10: Beta Convergence. Per Capita IDBH. 1975-2000.
Source: Own calculations based on data from CEGA. Constant prices of 1994.

Figure 4.11: Beta Convergence without Guajira. Per Capita IDBH. 1975-2000.

Figure 4.12: Log of Relative PDB by Department. 1975-2000.

Figure 4.13: Log of Relative IDBH by Department. 1975-2000.

Figure 4.14: Log of Relative PDB. All Departments. 1975-2000.

Figure 4.15: Log of Relative IDBH. All Departments. 1975-2000.
Source: Own calculations based on data from CEGA. Constant prices of 1994.

Figure 4.16: Univariate Kernel Density Estimators of Relative Per Capita PDB. Years 1975
and 2000. Constant Prices of 1994.
[Plot: density function against log of relative pc PDB.]
Source: Own calculations based on data from CEGA. Variables in logs.

Figure 4.17: Relative Per Capita PDB Dynamics. Years 1975 and 2000. Constant Prices of
1994.
Source: Own calculations based on data from CEGA. Variables in logs.

Figure 4.18: Relative Per Capita PDB Dynamics: Contour Plot. Years 1975 and 2000.
Constant Prices of 1994.
Note: Contours are drawn at 30%, 60% and 90%. The points represent the 25 observations. Points outside the
contour level curves are identified. A 45 degree line is added to the plot.

Figure 4.19: Univariate Kernel Density Estimators of Relative Per Capita IDBH. Years 1975
and 2000. Constant Prices of 1994.

Figure 4.20: Relative Per Capita IDBH Dynamics. Years 1975 and 2000. Constant Prices of
1994.
Source: Own calculations based on data from CEGA. Variables in logs.

Figure 4.21: Relative Per Capita IDBH Dynamics: Contour Plot. Years 1975 and 2000.
Constant Prices of 1994.
Source: Own calculations based on data from CEGA. Variables in logs.
Note: Contours are drawn at 30%, 60% and 90%. The points represent the 25 observations. Points outside the
contour level curves are identified. A 45 degree line is added to the plot.


Essay 5 


Regional growth convergence in Colombia using social indicators 


Abstract 


This paper investigates convergence in social indicators among Colombian departments from 
1973 to 2005. We use census data and apply both the regression approach and the distribu- 
tional approach (univariate and bivariate kernel density estimators). Using literacy rate as 
a proxy for education, we find convergence between 1973 and 2005, but persistence in the 
distribution between 1975 and 2000, when we use the infant survival rate and life expectancy 
at birth as proxies for health. Additionally, using data from Demographic and Health Sur- 
veys, we find some evidence of convergence in the rate of children that are well-nourished 
between 1995 and 2005. 


Based on joint work with Adriana Cardozo. 


5.1 Introduction 


The majority of studies on convergence use macroeconomic aggregates to study whether 
poor countries are catching up with wealthier countries, and are particularly interested in the 
speed at which this process occurs, when it does occur. A large part of the empirical analysis 
of convergence is based on the neoclassical model developed by Solow (1956) and the esti- 
mation procedure suggested by Barro and Sala-i-Martin (1992a). Most of these studies use 
cross-sectional regressions based on per capita gross domestic product (GDP) to investigate 
if poor regions have higher rates of growth as they develop than wealthier areas. 

In recent years, some authors have also tested whether there is convergence in living standards, 
arguing that it is well-being that really matters and that per capita GDP is not an 
appropriate indicator of it.¹ For example, Neumayer (2003) and Kenny (2004) conclude 
that even in the absence of convergence in per capita GDP, there is convergence in living 
standards among poor and rich countries, a phenomenon praised by Neumayer as one of the 
greatest success stories of development in the last century. 

This argument seems to be valid for cross-country analysis, but few studies exist investi- 
gating regional convergence in living standards in developing countries. Within a particular 
country, convergence analysis is important, not only to focus development assistance towards 
regions lagging behind in economic growth, but also to evaluate the efficiency and scope of 
public policies. Lack of convergence and the persistence of inequalities inside a country can 
lead to political instability, social unrest, and violence, given that people are concerned not 
only with their own improvements, but also with meeting the living standards of the wealth- 
iest regions (Kenny, 2004). In Colombia, high regional inequality in living standards is one 
of the possible underlying causes of the ongoing domestic conflict, which is fueled by drug 
trafficking and corruption, particularly in isolated regions with a low level of governmental 
presence and a high incidence of poverty. 

Over the past 30 years, Colombia has witnessed important reforms in social areas, partic- 
ularly in health care and education. These reforms were marked by increased governmental 
intervention during the mid seventies, such as the creation of a national health care system 
and vaccination campaigns to eradicate tropical diseases like malaria, and, in the early eight- 
ies, policy formulations to reduce mortality and increase primary education. At the beginning 
of the nineties, decentralization accelerated in order to reduce the fiscal burden on the central 
government, making municipalities more responsible for generating and administrating their 
own resources, and partially moving the provision of health services to the private sector 
(Hernández and Obregón, 2002). 


¹ In this paper, we use the terms non-income indicators, social indicators, quality-of-life variables, 
and living standards interchangeably. 

The link between decentralization policies and the improvement in living standards in 
Colombia is a topic of extensive debate. While some researchers argue that the health poli- 
cies have categorically failed, others conclude that they have been successful, although rec- 
ognizing that there is still a large margin for improvement, particularly given that the system 
in place is still too recent to make a general judgement (Homedes and Ugalde, 2005; Barrera 
and Domínguez, 2006). Evidence shows that even though social spending increased consid- 
erably and reforms were designed to compensate municipalities with weak fiscal capacities, 
fiscal equalization has not been achieved. 

Although Colombia has experienced improvements in living standard indicators at the 
national level, particularly those related to health, education, and access to public services, 
there is evidence that these improvements have been unevenly distributed. Investigating 
regional convergence in social indicators is therefore relevant, all the more so because the 
topic is relatively under-researched: out of 20 research documents about regional convergence 
in Colombia produced in the last 15 years (Aguirre, 2008), only two published articles deal 
with convergence in non-income indicators (Aguirre, 2005; Meisel and Vega, 2007). 

The objective of this study is to analyze whether departments that were lagging behind 
in social indicators in the 1970s were able to catch up by the year 2000. We use both the 
regression approach suggested by Barro and Sala-i-Martin (1991), among others, and the 
distributional approach pioneered by Quah (1997) to test for convergence in three variables, 
based on census data. Using literacy rate as a proxy for education, we find regional conver- 
gence between 1973 and 2005, but persistence in the distribution between 1975 and 2000 
when we use the infant survival rate and life expectancy at birth as proxies for health. Ad- 
ditionally, using data from Demographic and Health Surveys (DHS), we find convergence 
between 1995 and 2005 in the percentage of children that are well-nourished. 

This paper is divided into six sections. After this introduction, we argue in section 5.2 
that it is fundamental to focus upon indicators other than income and that the analysis of 
convergence at the department level is relevant for Colombia. We present two methods to 
empirically test for convergence in section 5.3. Section 5.4 describes the data used and the 
details of our empirical estimation. Results are presented in section 5.5 and discussed in 
section 5.6. 


5.2 Motivation 


The importance of analyzing social indicators, instead of focusing only upon income, has 
been extensively discussed in the work of Amartya Sen, who argues that social opportunities 
are one of the five types of instrumental freedoms that contribute to the overall freedom 
people have to live as they choose (Sen, 1999). Social opportunities are understood to be the 
arrangements that a society makes which influence the individual's substantive freedom to 
live better, such as providing education and health care, which are important not only for 
citizens in the conduct of their private lives, but also for more effective participation 
in economic and political activities. 


Public policies have a crucial influence on social opportunities (e.g., they can influence 
longevity through epidemiological policies and education through the provision of the corre- 
sponding facilities). Thus, it is important to shift attention to elements that affect individual 
well-being and freedom but which are not captured by income statistics (Sen, 1997). 

We pay particular attention to health status when analyzing social inequalities. 
Sen (1998) argues that mortality is a key economic indicator, given that it mirrors the 
success or failure of a society. Indicators like the infant survival rate, which responds very 
rapidly to public health policies, are central to that kind of analysis (Mazumdar, 2003). Along 
the same lines, indicators referring to the level of nutrition reflect one of the most basic needs 
for survival, namely access to adequate food (Sen, 2002). 

In Colombia, awareness of the influence and scope of the government on health and 
education gradually increased in the second half of the 20th century, and has translated into 
programs, as well as legislation, aimed at achieving universal access to health care and 
primary education. Policies for both sectors experienced important changes in the last quarter 
of the century, reflecting the transition from a centralist system to a locally managed one. 

In the seventies, policies on education focused upon reducing illiteracy and increasing the 
coverage of primary education, particularly in rural areas, where public schools and teachers 
were almost nonexistent.² However, rural areas continued to lag behind. Adult education and 
literacy campaigns continued through the eighties, together with the expansion of secondary 
and preschool education. 

Due to macroeconomic imbalances at the end of the eighties, the government initiated 
a decentralization program to reduce the financial burden on the central government, and 
transferred substantial revenues and responsibilities to local administrations. The country 
has been praised as a leading example of fiscal decentralization, a policy intended to 
increase social expenditures and efficiency in social sectors, and to compensate 
territories with weak fiscal capacities financially. This process accelerated with the 
new constitution in 1991 (Rojas, 2003; Barrera and Domínguez, 2006). As a result, social 
spending increased from 7 to 15 percent of GDP between 1991 and 2001. Concerning 
education, evidence shows that reforms were beneficial for the urban sector, but less so for 
the rural sector, which lags behind the former in quality of education and net 
enrolment rates, particularly for secondary education (Velez et al., 2003). 

² With society facing limited resources, secondary and tertiary education was left largely to the private sector 
on the argument that spending on public universities would favor middle- and upper-income groups. 


Concerning health policies, by the mid seventies, the government had increased its role 
as a provider of health care and had implemented programs to improve neonatal care and 
nutrition, eliminate tropical diseases through mass vaccinations, and promote reproductive 
health. At the same time, coverage of public services in the country expanded. An important 
result of these campaigns was the reduction in infant mortality, due in part to better access to 
drinking water. This reduction led to higher life expectancy at birth (Profamilia, 2005). 


In the nineties, the policy orientation shifted; the government increased the role of the 
private sector as health provider and tried to strengthen the national health care system through 
the decentralization of services at the local level. Implementation of a dual system, 
combining the contributions of formal sector employees with subsidies for the population outside 
the system, yielded an increase in health coverage, which reached 58 percent in 2000 
(Hernández and Obregón, 2002). 


Statistics on health and education at the national level show significant improvements 
in the last quarter of the 20th century, and indicate that the country is a successful case 
within Latin America. Colombia is at, or even slightly above, the Latin American average in 
many social indicators. As an example, the average adult literacy rate was 78 percent in 1970 
and increased to 93 percent in 2005, which is higher than the 
Latin American average of 90 percent (World Bank, 2008).³ However, these gains do 
not seem to have been homogeneously distributed across society. Inequality is anchored 
at different levels: between departments (the main administrative units), between urban 
and rural areas, and also inside urban areas. Given that Colombia is a land of contrasts, 
where one can find well-developed modern metropolitan areas with reasonable infrastructure 
alongside large, underdeveloped, low-density areas lacking even basic infrastructure, looking at 
variables at the national level clearly masks differences. 


Colombia is divided into 32 departments and the capital district of Bogotá, a division 
that has existed formally since the 1991 constitutional reform, when nine former intendancies 
and commissariats, sparsely populated areas, were acknowledged as departments (Amazonas, 
Arauca, Casanare, Guainía, Guaviare, Putumayo, San Andrés y Providencia, Vaupés, 
and Vichada).⁵ Departments are important political entities in Colombia, with elected local 
governments and separate department assemblies. Hence, it is relevant to analyze the 
performance in social indicators by department, investigating whether those that were lagging 
behind in the early seventies had been able to catch up by the year 2000. 

³ Similarly, in 1980, life expectancy at birth was 66 years, and in 2005, it reached 73 years, which is the average 
for the region. Infant mortality per 1,000 live births fell from 68 deaths in 1970 to 17 in 2005, compared 
to a regional average of 23 (World Bank, 2008). 
⁴ As an example of inequality, the income Gini coefficient, which had modest decreases in the eighties, reached 
0.56 in 2004, almost as high as the value for Brazil for the same year. 

In this paper, we test the hypothesis that among Colombian departments, there was 
convergence in social indicators in the period from 1973 through 2005. This hypothesis is based 
on the following facts. First, starting in the mid seventies, national policies aimed at 
reducing illiteracy and improving neonatal care were put into place. Second, decentralization 
reforms were conducted, starting in the late eighties, to increase the efficiency of spending 
on education and health care. Third, after decentralization, social spending, as a share of 
total gross domestic product, doubled. Fourth, as will be discussed later, social indicators are 
naturally bounded, which facilitates convergence if better-off departments are close to 
the upper bound in the initial year. Finally, empirical results presented in Essay 4, using per 
capita gross disposable income⁶ as a proxy for well-being, suggest slow convergence among 
Colombian departments between 1975 and 2000. 

We explain in the next two sections the methods used to test for convergence and the 
variables included in the analysis. 


5.3 Methods for Measuring Convergence 


We will consider two alternatives to empirically test for convergence. The first one is the re- 
gression approach (Magrini, 2004), also called the classical approach to convergence analy- 
sis (Sala-i-Martin, 1996), which is the most frequently used analysis in the literature. Robert 
Barro and Xavier Sala-i-Martin are among the best known authors associated with it (Barro 
and Sala-i-Martin, 1991, 1992a,b, 2004; Sala-i-Martin, 1996). The second alternative is the 
distributional approach to convergence, pioneered by, among others, Danny Quah (Quah, 
1993a,b, 1996, 1997). Both approaches are presented briefly in this section. 

Within the classical approach to convergence analysis, the concepts of beta-convergence
and sigma-convergence are relevant. Beta-convergence is related to the mean-reversion of
the variable of interest. It is typically tested by regressing the average growth rate of the
variable of interest on its initial level.⁷ If the regression coefficient is negative and
statistically significant, this means that the variable tends to grow more quickly in regions that
lagged behind at the beginning of the period considered.

5 As it is often done in studies with Colombian data at the department level due to lack of data, we group them
as one unit under the name Nuevos departamentos.
6 IDBH is the abbreviation, based on the Spanish denomination Ingreso Departamental Bruto disponible de los
Hogares (CEGA, 2006b).
7 To be precise, we are discussing here absolute or unconditional beta-convergence, which assumes that regions
are structurally similar.


A reduction over time in the dispersion of the variable of interest across entities (in our 
case, departments) indicates a more equitable distribution and is known as sigma-convergence. 
Testing for sigma-convergence is performed by checking the evolution of the standard devi- 
ation over time, or the coefficient of variation if the mean of the variable changes. The exis- 
tence of beta-convergence tends to generate sigma-convergence. Beta-convergence is a nec- 
essary, but not a sufficient condition for observing sigma-convergence. Sigma-convergence 
is an indicator of the dispersion across departments, but it does not tell us much about the mobility of
each one.
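As an illustration of the sigma-convergence check described above, the following minimal sketch compares the standard deviation and the coefficient of variation of two cross-sections. The department values are hypothetical and are not taken from the data used in this essay; the function name `dispersion` is likewise only illustrative.

```python
from statistics import mean, pstdev


def dispersion(values):
    """Return (standard deviation, coefficient of variation) of a cross-section."""
    sd = pstdev(values)  # population formula: every department is observed
    return sd, sd / mean(values)


# Hypothetical literacy rates (percent) for five departments in two years.
lit_initial = [55.0, 62.0, 70.0, 78.0, 90.0]
lit_final = [72.0, 80.0, 86.0, 90.0, 95.0]

sd0, cv0 = dispersion(lit_initial)
sd1, cv1 = dispersion(lit_final)

# Sigma-convergence: dispersion is lower in the final year than in the initial year.
sigma_convergence = sd1 < sd0
print(f"SD: {sd0:.2f} -> {sd1:.2f}, CV: {cv0:.3f} -> {cv1:.3f}")
```

Both measures fall in this made-up example, but, as the text notes, neither says which departments moved within the distribution.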


Quah (1997) criticizes the classical approach to convergence analysis, arguing that nei- 
ther beta-convergence nor sigma-convergence can deliver useful answers to the question of 
whether poor countries or regions are catching up to wealthy ones. Quah argues that the 
classical approach does not provide any information about mobility, stratification, or polar- 
ization, and suggests analyzing the distribution dynamics directly. One alternative proposed 
by him is to work with a sequence of distributions and, after discretizing the space of val- 
ues, to count the observed transitions into and out of the distinct cell values and construct a 
transition probability matrix (Quah, 1993a,b). 
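The discretization step can be sketched as follows. This is only an illustration under assumed inputs: the relative values and the cell cutoffs are hypothetical, and the helper `transition_matrix` is not part of any package used in this essay.

```python
from bisect import bisect_right


def transition_matrix(initial, final, cutoffs):
    """Row-normalised counts of moves between the cells defined by `cutoffs`."""
    k = len(cutoffs) + 1
    counts = [[0] * k for _ in range(k)]
    for x0, x1 in zip(initial, final):
        # bisect_right maps a value to the index of the cell it falls into.
        counts[bisect_right(cutoffs, x0)][bisect_right(cutoffs, x1)] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical values relative to the national average (1.0 = average).
rel_initial = [0.70, 0.80, 0.95, 1.00, 1.05, 1.20]
rel_final = [0.85, 0.80, 1.00, 1.02, 1.04, 1.08]
cutoffs = [0.9, 1.1]  # three cells: below 0.9, from 0.9 up to 1.1, 1.1 and above

P = transition_matrix(rel_initial, rel_final, cutoffs)
for row in P:
    print(row)
```

Each row gives the estimated probabilities of ending in each cell, conditional on the starting cell; a matrix close to the identity would indicate persistence.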


Quah (1997) warns, however, that a discretization could distort dynamics if the underlying
observations are indeed continuous variables. He therefore suggests not discretizing at all,
and rather thinking of the number of distinct cells as tending to infinity and to the continuum, with the
transition probability matrix tending to a matrix with a continuum of rows and columns, i.e.,
becoming a stochastic kernel.⁸ In particular, the proposed methodology is based upon tracking
the evolution of the entire cross-section distribution across regions over time through
the estimation of kernel densities for ‘relative’ variables. By relative variables, we mean
that the variables of interest are expressed as relative to the national average, which allows
abstraction from changes in the mean when we look at how the distribution changes.


Empirically, in a graph showing how the cross-sectional distribution of the relative vari- 
able of interest changes between two periods, if most of the mass of the estimated bivariate 
kernel density is concentrated along the 45-degree diagonal, then regions basically remain 
where they started. We will refer to this situation as persistence in the distribution of the 
relative variable of interest. 


8 For a technical derivation of a stochastic kernel, see Quah (1997, section 4).
5.4 Data and Empirical Estimation


The convergence analysis in our paper is done at the department level. Using two cross- 
sections per variable, we treat each department as an observation and do not use population 
weights. We are interested in investigating whether departments that were lagging behind 
were able to catch up, and consider this to be a pertinent question in the Colombian case 
where, as mentioned in section 5.2, departments are important political entities. 

In this section, we deal briefly with the selection of variables, the transformation of the
variables needed in some cases, and the particular choices we apply for the empirical
estimation.


5.4.1 Data 


As discussed by Micklewright and Stewart (1999), many quality-of-life variables have a 
complement (e.g., the infant survival rate is 1,000 minus the infant mortality rate). They 
warn that sigma-convergence results may depend upon whether one uses a variable or its 
complement. Kenny (2004) argues for measuring convergence towards a maximum and not 
towards zero, claiming that the latter approach favors small absolute changes, close to zero, 
above large absolute changes, further from zero. Additionally, he claims that convergence 
towards the maximum (i.e., a positive value) is what the majority of the literature on global 
trends does. We follow these arguments and use 'positive' variables in this study. By this we
mean that we transform the variables so that they are, in theory, positively correlated with
living standards.
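The warning that results may depend on whether one uses a variable or its complement can be illustrated with a small sketch. The infant mortality figures below are hypothetical; the example only shows that the standard deviation is the same for a rate and its complement, while the coefficient of variation is not.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    return pstdev(values) / mean(values)


# Hypothetical infant mortality rates (per 1,000 live births); every rate halves.
imr_initial = [40.0, 60.0, 80.0, 100.0]
imr_final = [20.0, 30.0, 40.0, 50.0]

# 'Positive' complement: infant survival rate = 1,000 minus infant mortality rate.
isr_initial = [1000.0 - m for m in imr_initial]
isr_final = [1000.0 - m for m in imr_final]

cv_imr = (coefficient_of_variation(imr_initial), coefficient_of_variation(imr_final))
cv_isr = (coefficient_of_variation(isr_initial), coefficient_of_variation(isr_final))

print("SD, mortality vs. survival (final year):", pstdev(imr_final), pstdev(isr_final))
print("CV of mortality:", cv_imr)  # unchanged by the proportional fall
print("CV of survival: ", cv_isr)  # falls between the two years
```

In this constructed case the coefficient of variation signals no convergence for mortality but convergence for survival, which is why the choice of the 'positive' form has to be made explicitly.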

We have tried to obtain data at the department level for years close to 1975 and 2000, 
as we additionally want to compare our regional convergence results with those found for 
two income measures in Essay 4 for the period from 1975 to 2000. Evaluating a period of 
25 years to investigate convergence seems reasonable as the time span roughly represents a 
generation. Our main source of data at the department level is DANE.⁹ It kindly provided
illiteracy rate data that were computed from information obtained in censuses for the years
1973 and 2005. For health data, we obtained infant mortality rates and life expectancy at
birth. Data for the year 1975 are from DANE (1990) and for 2000, from DANE (2007). As
explained before, we transform two of the variables so that they are 'positive': we work
with literacy rates instead of illiteracy rates, and with infant survival rates instead of infant
mortality rates.

9 Departamento Administrativo Nacional de Estadística. DANE is the official statistical agency in Colombia
(http://www.dane.gov.co/).
The literacy rate (LIT) is the percentage of literate population above age 5. Being illiterate
can be considered a deprivation of a very basic capability. As argued by Sen (1999), basic 
education can additionally be considered a semi-public good, as it is not only the literate 
person who benefits from it, but society in general, for example through the reduction in 
fertility and mortality. 

The infant survival rate (ISR) is the number of babies that survive until their first birthday 
out of every 1,000 live-born babies during a particular year. It is a measure of nutrition and 
hygiene in the first months of life, and also reflects the prevalence of contagious
diseases (Mazumdar, 2003). There are many empirical studies that show that women's
education and literacy tend to increase the survival rates of children (Sen, 1999). 

Life expectancy at birth (LEX) is the average number of years that a newborn is expected 
to live if current mortality rates continue to apply. This variable reflects the level of health 
care, nutrition, and income. However, at least at the cross-country level, some studies show 
that the connection between income and life expectancy works mainly through two channels, 
public expenditure on health care and the success of poverty eradication efforts (Sen, 1999). 

In addition to the data provided by DANE, we use data from Demographic and Health 
Surveys for Colombia for the years 1995 and 2005 containing information about child nutri- 
tion at the department level. As this variable covers a shorter time period than the other three 
variables and is based upon data that are not always representative at the department level, 
we will treat the results based on the ‘positive’ variable, well-nourished rate, carefully. 

Our well-nourished rate (WR) is defined as 100 minus the percentage of children who
are underweight. Underweight means insufficient weight for age and is commonly used as
a summary indicator of undernutrition (UNICEF, 1998). Undernutrition depends upon both
food intake and the ability to make nutritive use of it; this ability is influenced by general
health conditions that depend on health care and public health provisions (Drèze and Sen,
1989; Sen, 1999).


5.4.2 Empirical estimation 


As the mean of each variable of interest has increased in the period considered, to test for 
sigma-convergence we look at the standard deviation in the initial period and in the final 
period. 

To test for beta-convergence, we follow one of the estimations used by Bloom and
Canning (2007). We run regressions of the form

$$y_i = \alpha + \beta x_i + \varepsilon_i, \qquad (5.1)$$

where $x_i$ is the initial value of the variable of interest for department $i$, and $y_i$ is the change in
the variable of interest in the period considered.¹⁰ We assume that $\varepsilon_i \sim N(0, \sigma^2)$ and estimate
the regressions with ordinary least squares (OLS). We use HC3 robust standard errors, as
proposed by Davidson and MacKinnon (1993), to account for possible heteroscedasticity,
considering that the number of observations is relatively small (Long and Ervin, 2000). We
are interested in checking whether the estimated coefficient $\beta$ is negative and statistically
different from 0 at the 5 percent level, meaning that lower initial levels of the variable
of interest are associated with larger improvements in the period considered. To check
whether the results are robust, after the regression, we compute Cook's distance to detect
observations that have an unusual influence or leverage, and re-run the regressions on the
restricted sample, excluding those observations.
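The estimation steps just described can be sketched for the single-regressor case as follows. This is a minimal illustration, not the code used for the results in this essay: the data are hypothetical, the function name `beta_convergence` is made up, and the cutoff of 4/n for Cook's distance is a common rule of thumb that we assume here because the text does not state the threshold applied.

```python
from math import sqrt


def beta_convergence(x, y):
    """OLS of the change y on the initial level x, with HC3 s.e. and Cook's distance."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    beta = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    alpha = ybar - beta * xbar
    resid = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    lev = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]  # hat-matrix diagonal
    # HC3: each squared residual is inflated by 1 / (1 - leverage)^2.
    var_beta = sum(((xi - xbar) / sxx) ** 2 * (e / (1 - h)) ** 2
                   for xi, e, h in zip(x, resid, lev))
    s2 = sum(e * e for e in resid) / (n - 2)
    # Cook's distance with p = 2 estimated parameters.
    cooks = [e * e / (2 * s2) * h / (1 - h) ** 2 for e, h in zip(resid, lev)]
    return {"alpha": alpha, "beta": beta, "se_hc3": sqrt(var_beta), "cooks": cooks}


# Hypothetical initial literacy rates and their subsequent changes (percentage points).
x = [55.0, 60.0, 65.0, 70.0, 75.0, 80.0, 85.0, 90.0]
y = [22.0, 21.0, 17.0, 16.0, 12.0, 11.0, 7.0, 5.0]

fit = beta_convergence(x, y)

# Assumed rule of thumb: drop observations with Cook's distance above 4 / n, then re-run.
keep = [i for i, d in enumerate(fit["cooks"]) if d <= 4 / len(x)]
refit = beta_convergence([x[i] for i in keep], [y[i] for i in keep])

print(round(fit["beta"], 3), round(fit["se_hc3"], 3), round(refit["beta"], 3))
```

A negative slope that stays clearly below zero relative to its HC3 standard error, both before and after the influential observations are dropped, is what the text treats as robust evidence of beta-convergence.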

For the distributional approach, all variables are expressed relative to the Colombian
value, as was explained in section 5.3. We additionally take the logarithm of the relative
variable, as it facilitates the comparison to the national level. Expressed in logs, a relative
value that is equal to 0 means that the department has the same value as the country, while
a value that is, for example, equal to -0.05 means that the value of that department is approximately 5
percent lower than the national value.
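A short numerical check of this reading of the log relative variable, using hypothetical numbers (a department value of 81.7 against a national value of 86.0, i.e., a ratio of 0.95):

```python
from math import exp, log


def log_relative(value, national):
    """Logarithm of a department value expressed relative to the national value."""
    return log(value / national)


lr = log_relative(81.7, 86.0)      # hypothetical department and national values
pct_gap = (exp(lr) - 1.0) * 100.0  # exact percentage difference implied by the log value

print(round(lr, 4), round(pct_gap, 2))
```

The log value is close to -0.05 when the department is 5 percent below the national value; the correspondence is an approximation that is accurate only for small gaps.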

Before we define how we proceed to test for convergence using the distributional
approach, we briefly present some concepts needed for our estimation.¹¹

A univariate kernel density estimate can be regarded as a generalization of a histogram. It
has the form

$$\hat{f}_h(q) = \frac{1}{nh} \sum_{i=1}^{n} \kappa\left(\frac{q - Q_i}{h}\right), \qquad (5.2)$$

where $\kappa$ is a kernel¹², $h > 0$ is the bandwidth, also called the smoothing parameter, and $n$ is
the number of observations $Q_1, \ldots, Q_n$.


In the context of convergence, we are interested in checking whether we find unimodality 
or multimodality in the estimated univariate densities of the relative variable of interest in 
both periods, and in determining how the estimated densities changed. 
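A minimal sketch of a univariate Gaussian kernel density estimate and a simple grid-based count of its modes is given below. The data are hypothetical log relative values, and the bandwidth is fixed by hand rather than chosen by a data-driven selector, so this is not a reproduction of the estimates reported later.

```python
from math import exp, pi, sqrt


def gaussian_kernel(u):
    return exp(-0.5 * u * u) / sqrt(2.0 * pi)


def kde(q, data, h):
    """Evaluate (1 / (n h)) * sum_i kernel((q - Q_i) / h) at the point q."""
    return sum(gaussian_kernel((q - qi) / h) for qi in data) / (len(data) * h)


def count_modes(data, h, lo, hi, steps=400):
    """Count strict local maxima of the estimated density on an evenly spaced grid."""
    grid = [lo + (hi - lo) * j / steps for j in range(steps + 1)]
    dens = [kde(g, data, h) for g in grid]
    return sum(1 for j in range(1, steps) if dens[j - 1] < dens[j] > dens[j + 1])


# Hypothetical log relative values: a main group near 0 and a lagging group near -0.3.
log_rel = [-0.32, -0.30, -0.28, -0.04, -0.02, 0.0, 0.01, 0.03, 0.05]
h = 0.05  # bandwidth chosen by hand for illustration

modes = count_modes(log_rel, h, -0.5, 0.2)
print(modes)
```

A second mode to the left of the main one, as in this constructed example, is the kind of pattern that would suggest a group of departments lagging behind.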


10 We have also tried a specification proposed by Barro and Sala-i-Martin (1992a) that uses the average growth
rate as the dependent variable and a function of the logarithm of the initial value as a regressor, obtaining
similar results.
11 A review of the statistical principles of univariate and multivariate kernel density estimation can be found in
Härdle et al. (2004), for example.
12 Kernel refers to any smooth function satisfying the conditions $\kappa(q) \geq 0$, $\int \kappa(q)\,dq = 1$, $\int q\,\kappa(q)\,dq = 0$, and
$\sigma_\kappa^2 = \int q^2 \kappa(q)\,dq > 0$ (Wasserman, 2006). In kernel density estimation, the choice of the kernel does not
have a large impact on the estimation, but the choice of the bandwidth does.
Bivariate kernel density estimation requires two-dimensional data and a two-dimensional
kernel. Here, $Q = (Q_1, Q_2)^T$, and the kernel $K$ maps $\mathbb{R}^2$ into $\mathbb{R}_+$. The estimate has the form

$$\hat{f}_H(q) = \frac{1}{n} \sum_{i=1}^{n} K_H(q - Q_i), \qquad (5.3)$$

where $K$ is a bivariate kernel function, $K_H$ is its version scaled by $H$, a symmetric bandwidth matrix, and $n$ is the
number of observations.

For the analysis of convergence, we estimate the bivariate kernel density for the relative 
variable in two periods and check whether a large portion of the probability mass remains 
clustered around the 45-degree diagonal, which would indicate persistence in the distribu- 
tion. We present the 3D representation of the estimated bivariate density and a contour plot 
showing the highest density regions. 
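The bivariate check can be sketched in the same spirit. The example assumes a Gaussian kernel and the common convention in which the bandwidth matrix H plays the role of a covariance matrix; the data pairs and the non-diagonal H are hypothetical and hand-picked, not the output of a bandwidth selector.

```python
from math import exp, pi, sqrt


def bivariate_kde(q, data, H):
    """Gaussian kernel density estimate at q with a full 2x2 bandwidth matrix H."""
    (a, b), (c, d) = H
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))  # inverse of the 2x2 matrix
    norm = 1.0 / (2.0 * pi * sqrt(det))
    total = 0.0
    for q1, q2 in data:
        u, v = q[0] - q1, q[1] - q2
        quad = u * (inv[0][0] * u + inv[0][1] * v) + v * (inv[1][0] * u + inv[1][1] * v)
        total += norm * exp(-0.5 * quad)
    return total / len(data)


# Hypothetical (initial, final) log relative values lying close to the 45-degree line.
pairs = [(-0.30, -0.28), (-0.20, -0.21), (-0.10, -0.08),
         (0.00, 0.01), (0.05, 0.04), (0.10, 0.11)]
H = ((0.004, 0.003), (0.003, 0.004))  # symmetric, non-diagonal, positive definite

on_diagonal = bivariate_kde((-0.10, -0.10), pairs, H)
off_diagonal = bivariate_kde((-0.10, 0.10), pairs, H)
print(on_diagonal > off_diagonal)
```

Evaluating the estimate on a grid of points and drawing its contours gives the kind of plot discussed in the text; mass concentrated along the diagonal, as in this constructed example, corresponds to persistence.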

All the results for kernel density estimation presented in section 5.5 were computed with
the statistical software R (R Development Core Team, 2008) and the package ks.¹³ For both
univariate and bivariate kernel density estimation, we use Gaussian kernels and smoothed
cross validation (SCV) bandwidth selectors¹⁴ (Jones et al., 1991; Duong and Hazelton,
2005). As recommended by Bowman and Azzalini (1997), in the univariate case we use
the same bandwidth for both years, which is computed as the mean of the two bandwidths
selected for each year separately. In the bivariate case, the smoothed cross validation
is unconstrained, i.e., we do not impose the requirement that the (nonsingular) bandwidth
matrix H has to be diagonal. Hence, we are able to handle correlation between components,
as we allow kernels to have an arbitrary orientation (Wand and Jones, 1995). As we are
especially interested in checking whether a large portion of the probability mass remains
clustered around the 45-degree diagonal, this flexibility is relevant for us. If we were to
impose a diagonal matrix H, only kernels oriented along the coordinate axes would be
allowed.


5.5 Results 


We address in this paper the question of regional "positive convergence" in Colombia
(Micklewright and Stewart, 1999); that is, we investigate whether departments which were lagging
behind at the beginning of the period in certain variables of interest proxying for health care
and education have been able to catch up in a period that has been one of improvement on
average.¹⁵ As can be observed in the descriptive statistics of the variables (Table 5.1), there
has been a general improvement in all the variables chosen as proxies for living standards.

13 ks is currently the most comprehensive kernel density estimation package in R (Duong, 2008). All the
estimations were done with the function kde.
14 We also tried direct plug-in methods for bandwidth selection suggested by Sheather and Jones (1991) and
obtained results that are not very different.


5.5.1 Literacy Rate 


Table 5.1 shows that between 1973 and 2005, the average departmental literacy rate (LIT) 
increased from 70 percent to 86 percent. We find strong evidence of convergence in literacy 
rates. Both sigma-convergence (Table 5.1) and beta-convergence are observed (Tables 5.2 
and 5.3). In the OLS regression with all available observations, Bogotá and La Guajira 
are identified as having an unusual influence on regression results; the coefficient of the 
initial level remains negative and statistically significant, however, if one excludes these two 
departments. In Bogotá literacy rates were already high in 1973 (90 percent), making it more 
difficult to achieve further improvements. La Guajira is in the opposite position, a department 
with very low literacy rates in 1973 and with a minor improvement in the 32 years of analysis. 
As suggested by Meisel (2007a) in a study of this department, illiteracy is widespread in the
indigenous population, consisting of the Wayúu group, which predominantly lives in rural
areas. Estimates by this author show that around 80 percent of the Wayúu had not even
finished primary school in 2005.

The univariate density estimates in Figure 5.2 show that the distribution has narrowed 
between 1973 and 2005. The distribution in 2005 shows three modes, suggesting that some 
departments lag behind even if the dispersion decreased. According to the results of a boot- 
strap test of equality of the estimated densities for both years (Bowman and Azzalini, 1997) 
in Table 5.4, we can soundly reject the hypothesis that the two densities are identical. In Fig- 
ure 5.4, one can observe a clear pattern of convergence, as most of the mass of the estimated 
bivariate density is concentrated along an axis that is flatter than the 45-degree line.
Nevertheless, the case of La Guajira deserves attention; it was among the worst relative performers in
1973, and in 2005, it was the worst relative performer. La Guajira had literacy rates that were
28 percent lower than the national average in 1973 and 33 percent lower in 2005.

Although one can praise improvements for the other departments that were lagging behind
in 1973, it is important to note that the literacy rate only indicates the existence of a basic
education level, which is definitely important, but probably not adequate.¹⁶ Even considering
this very basic indicator, it is worrisome that in many departments (La Guajira, Chocó,
Sucre, Córdoba, Magdalena, Caquetá, Cesar, Nariño, and Bolívar), more than 15 percent of
the population is still illiterate. Unfortunately, we have not been able to access data at the
department level that proxy higher levels of education.

15 "Negative convergence" (European Commission, 1996), a situation of general deterioration towards the
standard of the worst, is not relevant for Colombia in the period considered.
16 Additionally, as discussed in Velez et al. (2003), there is some evidence that despite large and increasing
public expenditures on public education, particularly after decentralization, the quality of education is
decreasing.


5.5.2 Infant Survival Rate 


Between 1975 and 2000, the average departmental infant survival rate per thousand live 
births improved, increasing from 936 to 962 (Table 5.1). Results show that the standard 
deviation of this indicator decreased slightly, suggesting mild sigma-convergence. However, 
this result tells us little about how exactly the distribution of departments changed. 

When looking at beta-convergence, we find a negative coefficient for infant survival rate 
in 1975, but it is not statistically significant unless we exclude the department of Chocó from 
the regression (Tables 5.5 and 5.6). This exclusion is based on unusual influence or leverage 
and indicates that this department has an important influence on the lack of convergence. 
Chocó had the lowest performance in 1975 and even though infant survival rate improved 
there, by 2000, it was the furthest from the departmental average. Notice that the infant 
survival rate in Chocó in 2000 was lower than the departmental average in 1975 (Table 5.1). 

Figure 5.5 shows that after Chocó, departments with the lowest starting infant survival 
rates were Nariño, Caldas, Risaralda, and Quindío. Of these, Caldas, Risaralda, and Quindío
achieved the largest improvements during the period of analysis. Changes in departments 
that were close to the average in 1975 vary considerably. Antioquia, Norte de Santander, 
and Valle experienced important improvements, while Cauca, Córdoba, Guajira, and Bolívar 
stagnated. 

Kernel density estimates allow a closer look at changes in the distribution in relative 
terms. In Figure 5.6 we observe that the density was slightly narrower in 2000 than in 
1975. Both years had a bimodal distribution. The bootstrap test of equality of the estimated 
densities for both years shows that we cannot reject the hypothesis that the two densities 
are identical (Table 5.4). The bivariate kernel in Figure 5.7 and the corresponding contour 
plot in Figure 5.8 suggest persistence in the infant survival rate, as most of the estimated 
density is concentrated along the 45-degree diagonal. Thus, in relative terms, departments 
basically remained where they were in 1975. The departments of Chocó, Nariño, and Cauca 
are outside of the 90 percent contour. 

To summarize, coastal regions have the lowest infant survival rates, although the rates 
have improved over time. The infant survival rate is particularly low in 2000 in the Pacific 
region (e.g., 912 children per 1,000 births in Chocó) where population density is low and 
a large share of the population lives in rural areas with precarious sanitation infrastructure. 
This department is also prone to a higher prevalence of tropical diseases, given that it is one
of the rainiest and most humid regions in the world. Transport between small rural villages along
the river shores, which are prone to floods, is possible only via the extensive network of
rivers. This also explains why the scope and reach of health programs are limited.¹⁷ In
contrast, an important increase in infant survival rates has been observed in central regions
located along the main transport corridors of the country. Increased vaccination was achieved
in large cities and in departments where agglomeration around urban centers allows easier
access to the population.


5.5.3 Life Expectancy at Birth 


In Colombia, the average departmental life expectancy at birth was 62 years in 1975 and 
increased to 70 years in 2000 (Table 5.1). The standard deviation decreased, suggesting 
sigma-convergence. Beta-convergence analysis shows a negative relationship between the 
starting value in 1975 and its change up to 2000 (Figure 5.9, Tables 5.7 and 5.8). The 
regression coefficient is negative and statistically significant when including all observations, 
but it is insignificant once we exclude influential observations, in this case Chocó and Nariño, 
which experienced large improvements. In Nariño, people born in 2000 were expected to live
14 years longer than those born in 1975, and in Chocó the gain was 13 years. Once again
the coffee-growing region (Caldas, Quindío, and Risaralda) had outstanding improvements.
Sucre, located on the Atlantic coast, had the longest life expectancy at birth by 2000, even
exceeding that of the capital district of Bogotá, which frequently ranks in first place for
economic and well-being indicators. In contrast, the set of departments that we group as
Nuevos Departamentos in the eastern and southern parts of the country experienced modest
improvements, given their low starting position.

The univariate kernel density estimates of life expectancy at birth for 1975 and 2000 
in Figure 5.10 show that the distribution has become narrower, as was expected from the 
sigma-convergence result. Even if both distributions seem bimodal, the mode on the left 
of the distributions is much closer to the main mode in 2000 than in 1975. Nevertheless, 
according to the bootstrap test of equality of the estimated densities for both years in Table 
5.4, we cannot reject the hypothesis that the two densities are identical. 

In Figures 5.11 and 5.12, we observe the bivariate kernel density estimator, which is
computed using life expectancy at birth relative to the national average for both years. Once
again the results suggest persistence, rather than convergence, as most of the mass of the
estimated bivariate density is concentrated along the 45-degree diagonal. In relative terms,
departments basically remained where they were in 1975. Chocó, Nariño, and Nuevos Departamentos are
the three departments outside of the 90 percent contour. As mentioned before, Chocó and
Nariño improved dramatically in their relative positions, while Chocó remained the worst
performer in both years.

17 According to the Demographic and Health Survey from 2005, 20 percent of women in La Guajira did not
have any kind of prenatal care before delivery. These rates are also very high for Caquetá (20 percent), Cauca
(15 percent), Chocó (15 percent), and Córdoba (14 percent). In Chocó, 40 percent of births were attended
at home (usually by a midwife). The corresponding figures for Caquetá and Cauca are 32 percent and 31
percent, respectively (Profamilia, 2005).


5.5.4 Nourishment 


As was mentioned before, we must treat the results concerning nourishment with care, as we
only have data for 1995 and 2005, a period which is shorter than that used for the other variables,
and as we are using data for 23 departments¹⁸ from Demographic and Health Surveys which
are not always representative at the department level. Nevertheless, we consider that some
insight can be gained by investigating convergence in the rate of well-nourished children.

Table 5.1 shows that between 1995 and 2005, the departmental average of the well- 
nourished rate (WR) improved from 93 percent to 95 percent. Both sigma-convergence 
and beta-convergence are observed. Figure 5.13 plots the value of WR in 1995 against the 
change in WR between 1995 and 2005. There is a clear negative relationship between them 
which is confirmed with the regressions presented in Tables 5.9 and 5.10. 

Figure 5.14 shows the univariate kernel density estimates of the log of relative WR and 
confirms that the distribution is less skewed in 2005, but the bimodality observed in 1995 
remains in 2005. The bootstrap test of equality of the estimated densities for both years 
suggests that there are no systematic differences between them (Table 5.4). Bivariate kernel 
density estimators shown in Figures 5.15 and 5.16 suggest mild convergence in this indicator. 
Most of the mass of the estimated bivariate density is concentrated along an axis that is flatter
than the 45-degree line.


5.6 Conclusions 


Several points are important for the discussion. First, unlike income indicators,
social indicators have natural upper bounds (Neumayer, 2003;
Kenny, 2004). In the case of the three rates we use (infant survival rate, literacy rate, and
well-nourished rate), the upper bound is evident and constant (e.g., no department can have a
literacy rate above 100 percent). In the case of life expectancy at birth, the upper bound
can be thought of as variable, but slow-moving.¹⁹ As discussed by Neumayer (2003)
and Kenny (2004), among others, convergence is more likely to be observed in variables
with an upper bound when some departments are close to that bound in the initial year. Our
results show that this is the case for the classical approach to convergence, where
sigma-convergence is found in all cases, and beta-convergence also in all, with the exception of the
infant survival rate. Working with relative values and using the distributional approach, it is
possible to make a more precise evaluation if one uses bivariate kernel estimators. For
example, the distributional approach yields persistence in the distribution of life expectancy at
birth, while the beta-convergence analysis shows convergence. Regression results
can be driven by outliers, as is the case for this variable. If one excludes outliers, no evidence
of beta-convergence is observed.

18 No information is available for Caquetá and Nuevos Departamentos.

Second, it has been argued that at least at the cross-country level, one can observe con- 
vergence in social indicators even in the absence of convergence in income. Two reasons 
advanced for this are high returns to small marginal increases in income at low income levels 
and causal relationships between the social indicators themselves (Kenny, 2004), for
example between literacy and mortality. Additionally, the spread of best practices in health
care can lead to improvement in health outcomes, even without income convergence. This is
relevant for us, as results in Essay 4 suggest that there has been persistence in the distribution
of departmental per capita GDP between 1975 and 2000.²⁰

Third, the role of urbanization should also be highlighted, as it is relevant in our case. The 
percentage of total population in Colombia that lived in urban areas increased from 59 per- 
cent in 1973 to 75 percent in 2005. Urbanization facilitates convergence in social indicators, 
as it is easier to provide social services to urban residents than to rural (Kenny, 2004). As 
shown by Bettencourt et al. (2007), cities make economies of scale in infrastructure possible. 
Nevertheless, we stress that living in an urban area does not guarantee access to services. 

Fourth, departments in Colombia differ according to climatic and geographic conditions,
which have historically been decisive for agglomeration and the availability
of infrastructure, particularly roads. Two important consequences are that some diseases
only affect departments that are located in the tropics and that access to sanitation services
is more limited in isolated areas.
19 There is no consensus among scientists concerning the upper bound of life expectancy at birth, and values
between 85 years (which is the value currently used in the Human Development Reports) and 100 years have
been mentioned. Olshansky et al. (1990), for example, claim that it seems unlikely that life expectancy at
birth will exceed the age of 85, while other studies, based on extrapolations from historical trends, predict
that it could attain 100 years in developed countries by 2060 (Oeppen and Vaupel, 2002) or by 2300 (United
Nations, 2004).
20 PDB is the abbreviation used in Essay 4 for departmental GDP, based on the Spanish denomination "Producto
Departamental Bruto" (CEGA, 2006b).
Fifth, and related to the last point, the internal conflict in Colombia is not of the same
intensity across all departments.²¹ Although it is a widespread problem affecting the whole
country, sparsely populated departments, with predominantly rural areas having limited
access due to geographic conditions and in which state presence has been historically low or
weak, are more prone to the presence of illegal groups.


With these issues in mind, we begin the discussion of the results concerning convergence
in education. In the final quarter of the last century, we find clear evidence of regional
convergence in literacy rates in Colombia. As discussed by Barrera and Domínguez (2006), since
the seventies, there have been policies in place with the objective of reducing illiteracy
and increasing the coverage of primary education, which are partially responsible
for our result. Urbanization probably also played a role here. While some of the departments
with the best indicators in 1973 had a population that was primarily urban, many of the worst
performers in that year had a population that was mainly rural, and urbanization increased
relatively more in the worst performers.²² Another possible reason for the convergence result is
that our indicator considers the population above five years of age in both periods. The elderly
population in 1973, which was no longer alive in 2005, likely had relatively high illiteracy
rates, particularly in the departments which were lagging behind in 1973. Thus, there is a
generational effect; the older cohorts included in 1973 are no longer visible in the statistics
in 2005, while younger cohorts, who benefited from improved educational resources and
literacy campaigns, are included.


Even if regional convergence in literacy rates seems to be a robust result, one should keep 
in mind that it only reflects very basic education and we do not know whether convergence 
was observed in education at higher levels. The department of La Guajira deserves special 
attention as it still lags far behind in literacy rates.23


21 The current internal conflict began almost 50 years ago with the emergence of leftist guerrilla groups, the
root motivations of which were mainly ideological. Up until 1980, the military capacity of these groups
was limited and was concentrated in marginal areas of the country. Parallel to those, paramilitary groups
developed slowly in the eighties to defend isolated areas from guerrilla attacks. During the coca bonanza
("bonanza coquera") and the consolidation of drug trafficking in the eighties, illegal armed groups found new
ways of financing operations and expanding through the control of areas where illegal crops were grown, as
well as in territories that are rich in natural resources, particularly oil (Díaz and Sánchez, 2004).
22 As examples of well-performing departments in terms of literacy rates in 1973, consider Antioquia, which
had a share of urban population of 62 percent in 1973 and 78 percent in 2005, and Valle, which had 76 percent
in 1973 and 87 percent in 2005. Examples of poorly performing departments in 1973 are Chocó, which had
a share of urban population of 26 percent in 1973 and 54 percent in 2005, and Sucre, which had 47 percent
in 1973 and 64 percent in 2005.
23 As mentioned before, this department has a large indigenous population (44 percent), 80 percent of which
has not attained any educational degree (Meisel, 2007a).

Considering indicators of the health status of the population and using infant survival rates
(ISR) as a proxy, we find no evidence of beta-convergence. Following the distributional ap- 
proach and values expressed relative to the national levels, we find evidence of persistence in 
the distribution. In 2005 departments are basically where they were in 1973, in relative terms. 
These results can also be explained by a low prevalence of prenatal care in departments of 
the Pacific and Atlantic coasts (Profamilia, 2005). 


Using life expectancy at birth as a proxy for health yields similar results (persistence) 
according to the distributional approach. In relative terms, in the year 2005, departments are 
basically where they started in 1973. However, the two departments that had the lowest val- 
ues at the beginning of the period improved substantially in relative terms. Beta-convergence 
is driven by these two departments. 


Results regarding nourishment show regional convergence in the 10 years studied (1995 
through 2005). (As previously mentioned, estimates have to be considered cautiously be- 
cause the sample of 1995 is not representative for all departments.) Two interesting issues 
emerge. First, the well-nourished rate deteriorated slightly for some departments that were 
close to the upper bound in the initial year. Second, the three poorest departments on the
Pacific coast improved considerably in relative terms, as pointed out by Profamilia (2005).
Interestingly, on the Atlantic coast, nourishment relative to the national average stagnated.


Our results differ from those in the scarce literature on convergence among departments in 
Colombia which used health and education indicators. Some of our results contradict those 
obtained by Aguirre (2005), who finds convergence in life expectancy at birth, but not in ed- 
ucation. The difference concerning life expectancy can be explained as follows. We find that 
beta-convergence is driven by two influential observations, and once we use bivariate kernel
estimators with the variable expressed as a ratio to the national value, we find persistence
in the distribution of life expectancy rather than convergence. The difference concerning
education could be due to the fact that Aguirre uses the illiteracy rate for the analysis, while we
use literacy rates. Note that while Aguirre also computes univariate kernel density estimators
for the variables in both periods, they are based on absolute values (i.e., not relative to the 
national average), making judgements as to distributional changes more difficult given that 
the means vary. 
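
The role of relative values in the distributional approach can be illustrated with a minimal sketch. It assumes hypothetical literacy rates for five departments, uses the unweighted mean as a stand-in for the national level, and applies a Gaussian kernel with a normal-reference bandwidth; none of these choices is taken from the essay, and the function names are illustrative only.

```python
import math


def log_relative(values, national):
    """Log of each departmental value relative to the national level."""
    return [math.log(v / national) for v in values]


def normal_reference_bandwidth(sample):
    """Rule-of-thumb bandwidth 1.06 * sd * n^(-1/5) (an assumed choice)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in sample) / (n - 1))
    return 1.06 * sd * n ** (-0.2)


def gaussian_kde(sample, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density


def unweighted_mean(values):
    return sum(values) / len(values)


# Hypothetical literacy rates (percent) in an initial and a final year.
rates_start = [52.0, 60.0, 70.0, 80.0, 88.0]
rates_end = [75.0, 82.0, 86.0, 90.0, 92.0]

# Dividing by the national level removes the common upward shift in the mean,
# so that the two densities differ only in their shape around zero.
rel_start = log_relative(rates_start, unweighted_mean(rates_start))
rel_end = log_relative(rates_end, unweighted_mean(rates_end))

kde_start = gaussian_kde(rel_start, normal_reference_bandwidth(rel_start))
kde_end = gaussian_kde(rel_end, normal_reference_bandwidth(rel_end))

for x in (-0.2, 0.0, 0.2):
    print(f"x={x:+.1f}  start={kde_start(x):.3f}  end={kde_end(x):.3f}")
```

In this illustrative sample the final-year density is more concentrated around zero, which is the pattern one would read as a decrease in dispersion relative to the national level.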


Finally, the study of Meisel and Vega (2007) considers a much longer period (1870 
through 2003) for investigating regional convergence in Colombia, using the evolution of 
adult height over time as an alternative perspective on the standard of living. They find both 
sigma and beta-convergence among departments and highlight that nutritional improvements 
are among the main explanations of this result. Even if our results concerning nourishment 
cover a much shorter period (1995 through 2005), they point to the same conclusion. It could
be interesting to test for convergence among departments with the distributional approach
using the sample of Meisel and Vega (2007).

Lack of convergence among Colombian departments in the two variables proxying for 
health raises some doubts as to the effectiveness of current policies, as convergence is what 
one would expect, for the reasons explained above. Understanding why some departments 
still lag behind is relevant, as it is not clear whether the reasons are due mainly to the differ- 
ences in per capita income, climatic and geographic conditions, infrastructure, or behavior, 
or to still other factors. It is crucial to understand the main causes of infant and adult morbid- 
ity and mortality in the departments lagging behind to assess which specific policies could 
improve living conditions in each case. 


Summary of Results

                                  Social indicator used
                                  Literacy          Infant            Life expectancy   Well-nourished
                                  rate              survival rate     at birth          rate
Classical Approach: Convergence?
Sigma                             Yes               Yes               Yes               Yes
Absolute Beta (all obs.)          Yes               No                Yes               Yes
Absolute Beta (excl. outliers)    Yes               Yes               No                Yes
Distributional Approach
Univariate Kernel Estimators      Dispersion        Distribution      Distribution      Distribution
                                  decreases         unchanged         unchanged         unchanged
Bivariate Kernel Estimators       Convergence       Persistence in    Persistence in    Suggests slow
                                                    the distribution  the distribution  convergence

Note: Results for the distributional approach based on relative values, i.e. ratios to the national level.
5.7 Tables


Table 5.1: Descriptive Statistics of the Variables Used

Variable          Literacy            Infant survival     Life expectancy     Well-nourished
                  rate                rate                at birth            rate
Year              1973      2005      1975      2000      1975      2000      1995      2005
Mean              70.3      85.8      935.9     962.3     61.7      69.9      93.0      95.3
Median            69.0      87.4      940.1     967.8     62.5      69.9      93.8      95.7
Std. deviation    9.38      6.8       16.6      14.4      3.6       2.6       4.2       2.3
Skewness          -0.0      -1.8      -2.0      -2.0      -1.6      -1.0      -1.4      -0.9
Kurtosis          2.9       6.3       7.8       7.6       6.6       4.4       4.8       3.9
Range             37.3      30.2      81.3      64.8      17.6      11.5      17.8      9.8
Minimum           52.2      63.1      876.7     911.5     49.7      62.4      80.9      88.9
Maximum           89.6      93.4      958.1     976.3     67.3      73.8      98.7      98.7
Number of obs.    25        25        24        24        24        24        23        23

Source: Own calculations based on data at the department level from DANE and DHS.
Literacy rate measured as the percentage of literate population above age 5.
Infant survival rate measured per thousand live births.
Life expectancy at birth measured in years.
Well-nourished rate measured as the percentage of population under age 5 not underweight.
Table 5.2: Beta Convergence (OLS) using all available Observations. Dependent Variable:
Change in Literacy Rate (LIT) between 1973 and 2005

Variable                  Coefficient   Robust std. error   p-value
Intercept                 43.42         9.56                0.000
LIT in 1973               -0.40         0.13                0.005
Number of observations    25
R-squared                 0.50


Source: Own calculations based on data from DANE. 
Note: HC3 robust standard errors calculated following Davidson and MacKinnon (1993). 
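
The regression behind the beta-convergence tables can be sketched as follows. This is only an illustration under stated assumptions: the data are hypothetical, the function name `ols_hc3` is invented for the example, and the HC3 formula is written out for the single-regressor case rather than taken from any particular software package.

```python
import math


def ols_hc3(x, y):
    """Regress y on a constant and x; return estimates with HC3 standard errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Leverage of each observation (diagonal of the hat matrix).
    leverage = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # HC3 rescales each squared residual by 1 / (1 - h_i)^2.
    omega = [(e / (1.0 - h)) ** 2 for e, h in zip(residuals, leverage)]

    # Rows of (X'X)^{-1} X': the weight of each y_i in each estimate.
    w_slope = [(xi - x_bar) / sxx for xi in x]
    w_intercept = [1.0 / n - x_bar * w for w in w_slope]

    se_slope = math.sqrt(sum(w * w * o for w, o in zip(w_slope, omega)))
    se_intercept = math.sqrt(sum(w * w * o for w, o in zip(w_intercept, omega)))
    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": se_intercept,
        "se_slope": se_slope,
    }


# Hypothetical data: initial literacy rate and its subsequent change.
initial_rate = [52.0, 60.0, 65.0, 70.0, 75.0, 80.0, 85.0, 90.0]
change = [22.0, 20.0, 17.0, 16.0, 13.0, 12.0, 9.0, 6.0]

fit = ols_hc3(initial_rate, change)
# A negative slope indicates that departments starting lower improved more.
print(f"slope={fit['slope']:.3f}  HC3 se={fit['se_slope']:.3f}")
```

A negative and significant coefficient on the initial level is what the tables report as absolute beta-convergence.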


Table 5.3: Beta Convergence (OLS) excluding Outliers. Dependent Variable: Change in
Literacy Rate (LIT) between 1973 and 2005

Variable                  Coefficient   Robust std. error   p-value
Intercept                 49.36         5.93                0.000
LIT in 1973               -0.47         0.08                0.000
Number of observations    23
R-squared                 0.81

Source: Own calculations based on data from DANE.
Notes: Two departments (Bogotá and La Guajira) were excluded using Cook's distance
to detect unusual influence or leverage after the regression with all observations.
HC3 robust standard errors calculated following Davidson and MacKinnon (1993).
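
The outlier screening mentioned in the notes can be sketched in the same spirit. The data below are hypothetical, and the cutoff of 4/n is a common convention assumed here for illustration; the source does not state which threshold was applied.

```python
def cooks_distance(x, y):
    """Cook's distance of each observation in a regression of y on a constant and x."""
    n = len(x)
    p = 2  # estimated parameters: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    leverage = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    s2 = sum(e * e for e in residuals) / (n - p)  # residual variance
    return [
        (e * e / (p * s2)) * h / (1.0 - h) ** 2
        for e, h in zip(residuals, leverage)
    ]


# Hypothetical data; the first observation starts low and barely improves.
initial_rate = [40.0, 52.0, 60.0, 65.0, 70.0, 75.0, 80.0, 85.0, 90.0]
change = [5.0, 22.0, 20.0, 17.0, 16.0, 13.0, 12.0, 9.0, 6.0]

distances = cooks_distance(initial_rate, change)
cutoff = 4.0 / len(initial_rate)  # assumed rule of thumb
flagged = [i for i, d in enumerate(distances) if d > cutoff]
print("flagged observations:", flagged)
```

Observations flagged in this way would then be dropped before re-estimating the regression, which corresponds to the "excluding Outliers" tables.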
Table 5.4: Bootstrap tests of equality of univariate density estimates, Social indicators

Variable   Years compared   p-value
LIT        1973, 2005       0.004
ISR        1975, 2000       0.968
LEX        1975, 2000       0.733
WR         1995, 2005       0.118

Test proposed by Bowman and Azzalini (1997). 1000 replications.
Source: Own calculations based on data from DANE and DHS.


Table 5.5: Beta Convergence (OLS) using all available Observations. Dependent Variable:
Change in Infant Survival Rate (ISR) between 1975 and 2000

Variable                  Coefficient   Robust std. error   p-value
Intercept                 370.75        297.25              0.257
ISR in 1975               -0.37         0.32                0.225
Number of observations    24
R-squared                 0.28


Source: Own calculations based on data from DANE. 
Note: HC3 robust standard errors calculated following Davidson and MacKinnon (1993). 
Table 5.6: Beta Convergence (OLS) excluding Outliers. Dependent Variable: Change in
Infant Survival Rate (ISR) between 1975 and 2000

Variable                  Coefficient   Robust std. error   p-value
Intercept                 660.00        195.01              0.003
ISR in 1975               -0.68         0.21                0.004
Number of observations    23
R-squared                 0.41

Source: Own calculations based on data from DANE.
Notes: One department (Chocó) was excluded using Cook's distance to detect
unusual influence or leverage after the regression with all observations.
HC3 robust standard errors calculated following Davidson and MacKinnon (1993).


Table 5.7: Beta Convergence (OLS) using all available Observations. Dependent Variable:
Change in Life Expectancy at Birth (LEX) between 1975 and 2000

Variable                  Coefficient   Robust std. error   p-value
Intercept                 34.69         6.67                0.000
LEX in 1975               -0.43         0.11                0.001
Number of observations    24
R-squared                 0.48


Source: Own calculations based on data from DANE. 
Note: HC3 robust standard errors calculated following Davidson and MacKinnon (1993). 
Table 5.8: Beta Convergence (OLS) excluding Outliers. Dependent Variable: Change in Life
Expectancy at Birth (LEX) between 1975 and 2000

Variable                  Coefficient   Robust std. error   p-value
Intercept                 27.48         10.46               0.016
LEX in 1975               -0.32         0.17                0.070
Number of observations    22
R-squared                 0.19

Source: Own calculations based on data from DANE.
Notes: Two departments (Chocó and Nariño) were excluded using Cook's distance after the
regression with all observations to detect unusual influence or leverage.
HC3 robust standard errors calculated following Davidson and MacKinnon (1993).


Table 5.9: Beta Convergence (OLS) using all available Observations. Dependent Variable:
Change in Well-nourished Rate (WR) between 1995 and 2005

Variable                  Coefficient   Robust std. error   p-value
Intercept                 0.86          0.23                0.001
WR in 1995                -0.90         0.24                0.001
Number of observations    23
R-squared                 0.74


Source: Own calculations based on data from DHS. 
Note: HC3 robust standard errors calculated following Davidson and MacKinnon (1993). 
Table 5.10: Beta Convergence (OLS) excluding Outliers. Dependent Variable: Change in
Well-nourished Rate (WR) between 1995 and 2005

Variable                  Coefficient   Robust std. error   p-value
Intercept                 0.99          0.16                0.000
WR in 1995                -1.04         0.17                0.000
Number of observations    21
R-squared                 0.68

Source: Own calculations based on data from DHS.
Notes: Two departments (Chocó and La Guajira) were excluded using Cook's distance after the
regression with all observations to detect unusual influence or leverage.
HC3 robust standard errors calculated following Davidson and MacKinnon (1993).
Table 5.11: Poverty Headcount Index (% of Households below the Poverty Line) by Department.
1996 to 2005

Department      1996    1999    2002    2003    2004
Antioquia       54.1    57.8    58.9    55.6    54.1
Atlántico       49.5    57.9    53.2    52.1    48.2
Bogotá          37.8    46.3    36.1    34.2    29.5
Bolívar         60.3    59.9    67.8    51.5    54.6
Boyacá          65.0    62.6    72.3    70.3    71.5
Caldas          54.3    54.0    59.6    58.8    57.7
Caquetá         58.0    59.9    53.5    54.5    56.8
Cauca           61.4    73.3    64.5    69.0    63.0
Cesar           49.7    53.7    67.2    61.6    59.3
Chocó           70.9    78.0    62.6    70.3    71.6
Córdoba         74.3    72.6    68.5    66.5    70.8
Cundinamarca    44.5    50.9    58.4    51.9    53.6
Huila           62.9    62.7    74.4    69.7    66.3
Guajira         49.3    50.2    68.4    54.6    52.8
Magdalena       62.6    62.7    66.4    55.4    55.0
Meta            50.9    52.4    47.9    44.3    42.5
Nariño          68.1    71.7    70.7    71.2    67.3
N. Santander    61.3    58.2    51.3    31.3    57.9
Quindío         45.4    51.4    49.3    41.3    47.3
Risaralda       52.1    51.5    47.9    45.3    44.1
Santander       48.9    55.4    50.2    48.6    48.6
Sucre           47.9    64.0    69.4    56.5    65.7
Tolima          59.3    58.4    60.6    58.8    60.1
Valle           47.4    47.6    44.1    37.4    38.9

Source: Own calculations based on Household Surveys from DANE.
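
For readers unfamiliar with the measure, a headcount index of this kind can be computed as in the following sketch. The incomes, the poverty line and the optional survey weights are hypothetical, and the function name is illustrative; the actual household surveys and poverty lines used for Table 5.11 are not reproduced here.

```python
def headcount_index(incomes, poverty_line, weights=None):
    """Percentage of (weighted) households whose income is below the poverty line."""
    if not incomes:
        raise ValueError("at least one household is required")
    if weights is None:
        weights = [1.0] * len(incomes)  # unweighted case
    total = sum(weights)
    poor = sum(w for income, w in zip(incomes, weights) if income < poverty_line)
    return 100.0 * poor / total


# Hypothetical household incomes and poverty line (same monetary unit).
incomes = [50.0, 150.0, 80.0, 200.0]
line = 100.0

print(headcount_index(incomes, line))
print(headcount_index(incomes, line, weights=[3.0, 1.0, 1.0, 1.0]))
```

The weighted variant matters whenever households in the sample represent different numbers of households in the population.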
5.8 Figures


Figure 5.1: Evolution of Literacy Rate. 1973-2005

[Figure omitted. Axis labels: "Change in literacy rate 1973-2005" and "Literacy rate in 1973".]

Source: Own calculations based on data from DANE.


Figure 5.2: Univariate Kernel Density Estimators of Relative Literacy Rate. 1973 and 2005.

[Figure omitted. Axis label: "Density function".]

Source: Own calculations based on data from DANE. Variables relative to the national average and in logs.


Figure 5.3: Bivariate Kernel Density Estimators of Relative Literacy Rate. 3D Representation.
1973 and 2005.

Source: Own calculations based on data from DANE. Variables relative to the national average and in logs.


190 5. REGIONAL CONVERGENCE IN COLOMBIA: SOCIAL INDICATORS 


Figure 5.4: Bivariate Kernel Density Estimators of Relative Literacy Rate. Contour Plot.
1973 and 2005.

[Figure omitted. Axis label: "Log of relative LIT in 2005".]

Source: Own calculations based on data from DANE. Variables in logs. Contours are drawn at 30%, 60% and
90%. The points represent the 25 observations. Points outside the contour level curves are identified. A 45
degree line is added to the plot.


Figure 5.5: Evolution of Infant Survival Rate. 1975-2000

Source: Own calculations based on data from DANE.


Figure 5.6: Univariate Kernel Density Estimators of Relative Infant Survival Rate. 1975 and
2000.

[Figure omitted. Axis label: "Density function".]

Source: Own calculations based on data from DANE. Variables relative to the national average and in logs.


Figure 5.7: Bivariate Kernel Density Estimators of Relative Infant Survival Rate. 3D
Representation. 1975 and 2000.

Source: Own calculations based on data from DANE. Variables relative to the national average and in logs.


Figure 5.8: Bivariate Kernel Density Estimators of Relative Infant Survival Rate. Contour
Plot. 1975 and 2000.

Source: Own calculations based on data from DANE. Variables in logs. Contours are drawn at 30%, 60% and
90%. The points represent the 25 observations. Points outside the contour level curves are identified. A 45
degree line is added to the plot.
Figure 5.9: Evolution of Life Expectancy at Birth. 1975-2000

[Figure omitted. Axis label: "Change in life expectancy at birth 1975-2000".]

Source: Own calculations based on data from DANE.


Figure 5.10: Univariate Kernel Density Estimators of Relative Life Expectancy at Birth.
1975 and 2000.

Source: Own calculations based on data from DANE. Variables relative to the national average and in logs.


Figure 5.11: Bivariate Kernel Density Estimators of Relative Life Expectancy at Birth. 3D
Representation. 1975 and 2000.

Source: Own calculations based on data from DANE. Variables relative to the national average and in logs.


Figure 5.12: Bivariate Kernel Density Estimators of Relative Life Expectancy at Birth.
Contour Plot. 1975 and 2000.

Source: Own calculations based on data from DANE. Variables in logs. Contours are drawn at 30%, 60% and
90%. The points represent the 25 observations. Points outside the contour level curves are identified. A 45
degree line is added to the plot.




Figure 5.13: Evolution of Well-Nourished Rate. 1995-2005

Source: Own calculations based on data from DHS.


Figure 5.14: Univariate Kernel Density Estimators of Relative Well-nourished Rate. 1995
and 2005.

Source: Own calculations based on data from DHS. Variables relative to the national average and in logs.


Figure 5.15: Bivariate Kernel Density Estimators of Relative Well-nourished Rate. 3D
Representation. 1995 and 2005.

Source: Own calculations based on data from DHS. Variables relative to the national average and in logs.


Figure 5.16: Bivariate Kernel Density Estimators of Relative Well-nourished Rate. Contour
Plot. 1995 and 2005.

Source: Own calculations based on data from DHS. Contours are drawn at 30%, 60% and 90%. The points
represent the 25 observations. Points outside the contour level curves are identified. A 45 degree line is added
to the plot.
Objectives, Properties and Proofs


In this appendix, we present the objectives and properties that we consider relevant for any
composite index of social institutions related to gender inequality. Moreover, we
show that the proposed index fulfills all of them.


We use the following notation. Let $X^j$, with $j = A, B$, be the vector containing the
values of the subindices $x_i^j$, with $i = 1, \dots, n$, for the country $j$.24 $I(X)$ represents the composite
index.


Objectives of the Index 


The objectives of the index are the following: 


1. The index $I(X)$ should represent the level of gender inequality, so that countries can
be ranked.


2. The interpretation of $I(X)$ should be straightforward. As in the case of the subindices
$x_i$, the value 0 should correspond to no inequality and the value 1 to complete inequality.


3. For any subindex $x_i$, we interpret the value 0, i.e. no inequality, as the goal to be
achieved. The value zero can be thought of as a poverty line (see Ravallion, 1994;
Deaton, 1997; Subramanian, 2007, and references therein). We define a deprivation
function $\phi(x_i, 0)$, with $\phi(x_i, 0) > 0$ if $x_i > 0$, and $\phi(x_i, 0) = 0$ if $x_i = 0$. Higher values
of $x_i$ should lead to a penalization in $I(X)$ that should increase with the distance of $x_i$ to
zero, i.e. $\frac{\partial I(X)}{\partial x_i} > 0$ and $\frac{\partial^2 I(X)}{\partial x_i^2} > 0$.


4. $I(X)$ should not allow for total compensation among subindices, but permit partial
compensation. This somehow relates to the transfer axioms that should be fulfilled by
inequality as well as poverty measures (see Atkinson, 1970; Kakwani, 1984; Shorrocks
and Foster, 1987; Subramanian, 2007; Alkire and Foster, 2008, and references therein).
Assuming that two subindices have the same value, a decrease in $x_i$, i.e. less inequality,
is rewarded less in $I(X)$ than an equivalent increase in the other subindex $x_j$ is penalized.


5. $I(X)$ should be easy to compute and transparent.


24 In what follows, the superscript $j$ will only be used if it is necessary to distinguish countries.
Properties of the Index


Some of the properties that any index should fulfill are:

1. Support and range of $I(X)$:

   • $I(X)$ must be defined for $0 \le x_i \le 1$, $i = 1, \dots, n$.
   • $0 \le I(X) \le 1$ must hold for any $X$.
   • If $x_i = 0 \ \forall i$, then $I(X) = 0$. If $x_i = 1 \ \forall i$, then $I(X) = 1$.

2. Anonymity (symmetry): The value of $I(X^j)$ does not depend either on the names of
the subindices or on the name of the country ($j$).

3. Unanimity (Pareto Optimality): If $x_i^A \le x_i^B \ \forall i$, then $I(X^A) \le I(X^B)$.

4. Monotonicity: If, considering $X^A$ and $X^B$, country A is preferred to country B, and only
$x_i^A$ improves (i.e. decreases) for a given $i$, while $x_i^B \ \forall i$ remains unchanged, then country
A should still be preferred over country B.

5. Penalization of inequality in the case of equal means: Let the mean of $X^A$ be equal
to the mean of $X^B$. If the dispersion of $X^A$ is smaller than the dispersion of $X^B$, then
$I(X^A) < I(X^B)$.

6. Compensation property: In an example with two subindices, let $x_1 = x_2 > 0$, $\Delta x_1 \le 1 - x_1$,
and $\Delta x_2 \le 1 - x_2$. Assume that $x_1$ increases by $|\Delta x_1|$ and $x_2$ decreases by $|\Delta x_2|$.

   (a) If $|\Delta x_1| = |\Delta x_2|$, then $I(X)$ must increase.

   (b) For $I(X)$ to remain unchanged, we must have $|\Delta x_2| > |\Delta x_1|$.


Proofs

The composite index $I(X)$ is defined as

\[ I(X) = \frac{1}{n} \sum_{i=1}^{n} (x_i - 0)^2 . \]


The index proposed fulfills all the stated properties. 


1. Support and range of $I(X)$

   • $I(X)$ is defined for $0 \le x_i \le 1$, $i = 1, \dots, n$.
   • For any $X$, we have that $0 \le I(X) \le 1$.
   • If $x_i = 0 \ \forall i$, then $I(X) = 0$. If $x_i = 1 \ \forall i$, then $I(X) = 1$.


2. Anonymity (symmetry)
The value of $I(X^j)$ does not depend either on the names of the subindices or on the
name of the country ($j$).

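3. Unanimity (Pareto Optimality)

If we assume that $\forall i$

\[ x_i^A \le x_i^B , \]

then we can show that

\[
\begin{aligned}
(x_i^A)^2 &\le (x_i^B)^2 \\
\frac{1}{n} \sum_{i=1}^{n} (x_i^A - 0)^2 &\le \frac{1}{n} \sum_{i=1}^{n} (x_i^B - 0)^2 \\
I(X^A) &\le I(X^B).
\end{aligned}
\]
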
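4. Monotonicity

We assume that

\[
\begin{aligned}
I(X^A) &< I(X^B) \\
\frac{1}{n} \sum_{i=1}^{n} (x_i^A - 0)^2 &< \frac{1}{n} \sum_{i=1}^{n} (x_i^B - 0)^2 .
\end{aligned}
\]

Let us suppose, without loss of generality, that subindex $x_1$ improves (decreases) by
$\delta > 0$ for country A. Then we have that

\[ \frac{1}{n} (x_1^A - \delta - 0)^2 + \frac{1}{n} \sum_{i=2}^{n} (x_i^A - 0)^2 < \frac{1}{n} \sum_{i=1}^{n} (x_i^A - 0)^2 , \]

and hence

\[ \frac{1}{n} (x_1^A - \delta - 0)^2 + \frac{1}{n} \sum_{i=2}^{n} (x_i^A - 0)^2 < \frac{1}{n} \sum_{i=1}^{n} (x_i^B - 0)^2 . \]

This means that

\[ I(X^{A'}) < I(X^B) \]

with $X^{A'}$ defined as the vector corresponding to country A with only one subindex
having improved (decreased) by $\delta$.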
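5. Penalization of inequality in the case of equal means

If we assume equal means, so that

\[ \frac{1}{n} \sum_{i=1}^{n} x_i^A = \frac{1}{n} \sum_{i=1}^{n} x_i^B = \mu , \]

then we also have

\[ \sum_{i=1}^{n} x_i^A = \sum_{i=1}^{n} x_i^B = n\mu . \]

If we assume that the variance of $X^A$ is smaller than the variance of $X^B$ so that

\[ \frac{1}{n} \sum_{i=1}^{n} (x_i^A - \mu)^2 < \frac{1}{n} \sum_{i=1}^{n} (x_i^B - \mu)^2 , \]

we can show that

\[
\begin{aligned}
\sum_{i=1}^{n} \left[ (x_i^A)^2 - 2\mu x_i^A + \mu^2 \right] &< \sum_{i=1}^{n} \left[ (x_i^B)^2 - 2\mu x_i^B + \mu^2 \right] \\
\sum_{i=1}^{n} (x_i^A)^2 - 2\mu \sum_{i=1}^{n} x_i^A + n\mu^2 &< \sum_{i=1}^{n} (x_i^B)^2 - 2\mu \sum_{i=1}^{n} x_i^B + n\mu^2 .
\end{aligned}
\]

As $\sum_{i=1}^{n} x_i^A = \sum_{i=1}^{n} x_i^B$, we have that

\[
\begin{aligned}
\frac{1}{n} \sum_{i=1}^{n} (x_i^A)^2 &< \frac{1}{n} \sum_{i=1}^{n} (x_i^B)^2 \\
I(X^A) &< I(X^B).
\end{aligned}
\]


6. Compensation property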


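In an example with two subindices, let $x_1 = x_2 = x > 0$, $\Delta x_1 \le 1 - x_1$, and
$\Delta x_2 \le 1 - x_2$.

(a) We can show that if $\Delta x_1 = \Delta x_2 = \delta > 0$, then

\[
\begin{aligned}
x_2 &< x_1 + \delta \\
0 &< x_1 - x_2 + \delta \\
0 &< 2\delta (x_1 - x_2 + \delta) \\
x_1^2 + x_2^2 &< x_1^2 + x_2^2 + 2\delta (x_1 - x_2 + \delta) \\
\tfrac{1}{2} (x_1^2 + x_2^2) &< \tfrac{1}{2} \left[ (x_1^2 + 2\delta x_1 + \delta^2) + (x_2^2 - 2\delta x_2 + \delta^2) \right] \\
\tfrac{1}{2} (x_1^2 + x_2^2) &< \tfrac{1}{2} \left[ (x_1 + \delta)^2 + (x_2 - \delta)^2 \right] \\
I(x_1, x_2) &< I(x_1 + \delta, x_2 - \delta),
\end{aligned}
\]

and hence we have shown that if $x_1$ increases by $\delta$ and $x_2$ decreases by $\delta$, then
$I(X)$ must increase.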


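(b) We will show that if $x_1$ increases by $\Delta x_1$ and $x_2$ decreases by $\Delta x_2$ and the value
of the index remains unchanged, the increase of $x_1$ must be smaller than the
absolute value of the decrease in $x_2$.

\[
\begin{aligned}
I(x_1, x_2) &= I(x_1 + \Delta x_1, x_2 - \Delta x_2) \\
\tfrac{1}{2} (x_1^2 + x_2^2) &= \tfrac{1}{2} \left[ (x_1 + \Delta x_1)^2 + (x_2 - \Delta x_2)^2 \right] \\
x_1^2 + x_2^2 &= x_1^2 + 2 x_1 \Delta x_1 + (\Delta x_1)^2 + x_2^2 - 2 x_2 \Delta x_2 + (\Delta x_2)^2 \\
0 &= 2 x_1 \Delta x_1 + (\Delta x_1)^2 - 2 x_2 \Delta x_2 + (\Delta x_2)^2 .
\end{aligned}
\]

Using the fact that $x_1 = x_2 = x$, we can rewrite this as

\[
\begin{aligned}
0 &= 2x \Delta x_1 + (\Delta x_1)^2 - 2x \Delta x_2 + (\Delta x_2)^2 \\
0 &= 2x (\Delta x_1 - \Delta x_2) + (\Delta x_1)^2 + (\Delta x_2)^2 .
\end{aligned}
\]

As $2x > 0$, $(\Delta x_1)^2 > 0$, and $(\Delta x_2)^2 > 0$, we must have that

\[
\begin{aligned}
\Delta x_1 - \Delta x_2 &< 0 \\
\Delta x_1 &< \Delta x_2 .
\end{aligned}
\]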